Electrostatic loudspeaker




In the winter of 2014/2015 I built a set of electrostatic loudspeakers together with my friend John Rathleff.

It is a hybrid speaker with an electrostatic element covering the midrange and up (300 Hz – 20 kHz) and a traditional dynamic loudspeaker covering the bass (20 Hz – 300 Hz).

Designing and building an electrostatic speaker is a challenge, but a dedicated DIY’er can achieve very good results.

If you want to build an electrostatic speaker, you are faced with a number of practical challenges, like how to stretch and coat the diaphragm, how to glue it, how to insulate it and how to achieve the high voltages needed. Information on these subjects is available in several cookbooks and is not included here.

In the following, I will assume the reader has some knowledge of the subject and I will focus on the specific design that we have implemented.

The electrostatic principle

An electrostatic loudspeaker (ESL) is a speaker design based on the electrostatic principle.

A charge Q is distributed on the diaphragm and an electric field E is applied across the stators. This generates a force F on the diaphragm, where:

F = Q · E

The electric field is generated from the audio signal via step-up transformers, and the charge is generated via a high voltage supply.

If the electric field is homogeneous and the static charge is distributed evenly on the diaphragm, then you will have a force that is directly proportional to the audio signal, with minimal distortion.

Electrostatic forces are weak, so in order to reach realistic sound pressure levels you will need high voltages.

Diaphragm and charge

We use a thin and strong diaphragm (0.004 mm) that is coated with a substance that gives a very high impedance. The diaphragm is charged to 5,730 V through a 30 MΩ resistor.

The high impedance means that charge moves very slowly. This is important because it keeps the charge evenly distributed.

The diaphragm acts as a capacitor, and with the high impedances we get very long time constants. It takes some seconds to charge the diaphragm and much longer for the charge to disappear, so do not touch anything before you have discharged the diaphragm through a resistor.

The diaphragm is stretched tight and its movements are very small, so it stays near the middle of the gap between the stators, where the electric field is homogeneous.


The stator must be open (50% openness is fine) so the sound can pass through. This can easily be achieved either with a perforated plate or with a wire stator. We use an insulated wire stator.

On the net you may find discussions about multi-stranded wire, single-stranded wire or solid rods as stators. The argument against stranded wire is that the stator must be stiff to reduce stator vibration. We disagree.

We use a stranded wire stator, which is easier to work with and gives excellent results. But how much does it vibrate?

With a total wire length of 150 m per ESL element we get a copper weight of around 75 g, and the weight of the diaphragm is around 1 g. This weight ratio is important.

When an electrostatic force is applied to the diaphragm, the same force is applied to the stator but in the opposite direction. If the diaphragm moves out the stator will move in.

If the stator is totally flexible, i.e. free to move, we have a simple situation where we can apply Newton’s second law:

F = m · a

In abstract terms, we have two bodies (the diaphragm and the stator) with masses m1 and m2 and accelerations a1 and a2. When the bodies see the same force, we get:

m1 · a1 = m2 · a2

which can be rearranged to:

a1/a2 = m2/m1

The acceleration as well as the movement of the two bodies depends on the weight ratio, which is 1:75 in our case. If the diaphragm moves 1 mm, then the stator will move 1/75 mm, or close to zero.
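The mass argument above can be checked with a few lines of Python. This is only a sketch: the function name is made up for illustration, and it assumes, as the text does, that the stator is completely free to move and that both bodies see the same force for the same time.

```python
def stator_displacement_mm(diaphragm_move_mm, diaphragm_mass_g, stator_mass_g):
    """Recoil of a freely moving stator, from m1 * a1 = m2 * a2.

    With equal and opposite forces acting for the same time, the
    displacements scale like the accelerations: x2 / x1 = m1 / m2.
    """
    return diaphragm_move_mm * diaphragm_mass_g / stator_mass_g


# Masses quoted in the text: about 1 g of diaphragm, about 75 g of copper.
recoil = stator_displacement_mm(1.0, diaphragm_mass_g=1.0, stator_mass_g=75.0)
print(f"Stator moves about {recoil:.4f} mm for 1 mm of diaphragm travel")
```

A real stator is held by its frame, so it should move even less than this free-body estimate suggests.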

Point source versus Line source

Traditional dynamic speakers are point sources. That means that the sound radiates as if it came from a single point in space.

ESL speakers are often line sources. That means that the sound radiates as if it came from a vertical line.

Here are the major differences between a point source and a line source.

The point source:

  1. Sound from the point source radiates in all directions, where it is reflected from all room boundaries.
  2. Normally the diaphragm area of a point source is rather limited, and to achieve high SPL the point source has to make large excursions, leading to high distortion.
  3. The SPL of the direct sound falls by 6 dB per doubling of the distance. Unless you are very close to the speaker, most of the sound will come from reflections, not from the speaker.

The line source:

  1. Sound from the line source will be limited above and below the line source, thereby limiting reflections from ceiling and floor.
  2. The diaphragm area of a line source can be large, and it can produce high SPL with limited excursions, leading to almost no distortion.
  3. The SPL of the direct sound falls by 3 dB per doubling of the distance. Even if you are not very close to the speaker, you will receive most of the sound directly from the speaker, not from reflections.

All in all we find line sources very attractive.
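The difference between 6 dB and 3 dB per doubling adds up quickly over a normal listening distance. The sketch below compares the two falloff rates. It assumes ideal free-field sources (a true point and an infinitely long line), which no real speaker is, and the function name and the 1 m reference distance are chosen only for illustration.

```python
import math


def spl_drop_db(distance_m, reference_m=1.0, source="point"):
    """Drop in direct-sound SPL relative to the level at reference_m."""
    ratio = distance_m / reference_m
    if source == "point":
        return 20 * math.log10(ratio)  # spherical spreading, 6 dB per doubling
    if source == "line":
        return 10 * math.log10(ratio)  # cylindrical spreading, 3 dB per doubling
    raise ValueError("source must be 'point' or 'line'")


for d in (1, 2, 4, 8):
    point = spl_drop_db(d, source="point")
    line = spl_drop_db(d, source="line")
    print(f"{d} m: point source -{point:.1f} dB, line source -{line:.1f} dB")
```

A line source of finite height only follows the 3 dB rule up to a certain distance, after which it gradually behaves more like a point source.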

Open baffle

ESL speakers are normally of the open baffle type, radiating sound to the front as well as to the back. This has at least two consequences.

  1. The front wave and the back wave tend to cancel, at least below a certain frequency. This is a clear disadvantage that requires compensation.
  2. The radiation pattern looks like a figure 8 with no energy radiated to the sides. This has the advantage that sidewall reflections are reduced.

Open baffle systems can be complicated and require advanced frequency compensation circuits, but if done correctly they should be superior. Siegfried Linkwitz is a noted proponent of open baffle.

In our ESL solution, we use open baffle with advanced compensation via stator segmentation.

Segmented line source

Classic segmented line source

Classic segmentation (front view)

In the classic segmented line source you will see two or three segments.

Each segment takes care of a specific frequency range, and the width of each segment is optimized for the frequency range that the segment is handling.

A crossover splits the audio signal into separate frequency bands for each segment.

This solution has the same disadvantages as you will find in traditional two- or three-way speaker designs.

Advanced segmented line source

Advanced segmentation (top view)


Holland has an ESL club where I first came across a special version of the segmented line source developed by Edo Hulsebos. This is our primary source of inspiration.

It is based on a wire stator where wires are collected into segments. Each segment has a capacitance C proportional to the number of wires and each segment receives the music through a resistor network. With the right combination of segments and resistor values it is possible to control the frequency response and the dispersion and with 5-7 segments it will behave as a perfect line source for all frequencies.
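To get a feeling for how such a resistor network shapes the signal on each segment, here is a minimal sketch of a series-resistor, shunt-capacitor ladder. It is not Edo Hulsebos' program and not our actual network. Only the first resistor (82k) and the first capacitance (30 pF) follow figures quoted for the middle segment further down; every other value, and the assumption that each segment hangs on the ladder as a simple capacitor, is hypothetical.

```python
import math


def ladder_voltages(freq_hz, resistors_ohm, caps_farad):
    """Complex voltage on each segment of a series-R / shunt-C ladder for 1 V in."""
    omega = 2 * math.pi * freq_hz
    # Work inwards from the outermost segment to find the impedance at each node.
    z_node = [0j] * len(caps_farad)
    z_beyond = None  # nothing is connected beyond the outermost segment
    for k in reversed(range(len(caps_farad))):
        z_cap = 1 / (1j * omega * caps_farad[k])
        if z_beyond is None:
            z_node[k] = z_cap
        else:
            z_node[k] = z_cap * z_beyond / (z_cap + z_beyond)
        z_beyond = resistors_ohm[k] + z_node[k]
    # Walk outwards from the transformer, one voltage divider per stage.
    voltages = []
    v = 1.0 + 0j
    for k in range(len(caps_farad)):
        v = v * z_node[k] / (resistors_ohm[k] + z_node[k])
        voltages.append(v)
    return voltages


# Hypothetical ladder, middle segment first, then pairs of outer segments.
RESISTORS_OHM = [82e3, 100e3, 150e3, 220e3]
CAPS_FARAD = [30e-12, 60e-12, 60e-12, 100e-12]

for freq in (300, 3000, 20000):
    levels = [20 * math.log10(abs(v))
              for v in ladder_voltages(freq, RESISTORS_OHM, CAPS_FARAD)]
    print(f"{freq:>6} Hz:", "  ".join(f"{db:6.1f} dB" for db in levels))
```

With values of this kind, all segments receive nearly the full signal at low frequencies, while the outer segments are progressively attenuated as the frequency rises, which is the behaviour the segmentation relies on.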

Edo Hulsebos has developed a program that can simulate the loudspeaker.

We have designed a hybrid speaker where the ESL element covers the range 300 Hz – 20,000 Hz.

Here is a deeper explanation of our segmented line source:

  1. The highest frequencies are only sent to the middle segment. With 6 wires and a distance of 3 mm per wire, we have a middle segment that is 1.8 cm wide. This narrow strip gives very fine dispersion at the highest frequencies.
  2. With lower frequencies, the two neighboring segments will begin to emit sound together with the middle segment. This effectively increases the width of the speaker to include three segments with 6+2*6=18 wires or 5.4 cm. If we get the right balance between the width (5.4 cm) and the frequency (from RC) we can have the same dispersion as we achieved at the higher frequency.
    In reality, we have a speaker that behaves as a line source that continuously gets wider and wider with lower frequencies resulting in a constant directivity. Fantastic.
  3. Traditional electrostatic speakers need frequency compensation that raises the lower frequencies by 3-6 dB per octave. This is not necessary with this speaker because the radiating area grows as the frequency falls, resulting in a flat frequency response.
  4. The RC circuits also result in a slight delay of the signals to the outer segments. The signal will start at the middle segment and then move outwards towards the outer segments.
    If you imagine that the sound is radiated from a line positioned behind the diaphragm and the sound moves towards the diaphragm with the speed of sound then the diaphragm will have the exact same reaction starting from the middle and growing to the sides.
    In reality, we have a speaker that behaves exactly as if sound was radiated from a perfect line positioned behind the speaker.
  5. The area of the speaker is so large that it can achieve high SPL with very small diaphragm movements. That means that the speaker is free of all distortions except for possible distortion from the step-up transformer.
  6. ESL speakers are said to be superior in reproducing aperiodic signals because of the low weight of the diaphragm. I think that it is the low weight combined with a large area that is important, i.e. weight per area (g/cm2).
    Our ESL diaphragm weighs around 1 gram and covers an area of 2250 cm2 or 0.00045 g/cm2.
    The diaphragm of a traditional woofer (Scan Speak 26W4558T00) weighs 105 g and has a piston area of 352 cm2 giving 0.30 g/cm2. (670 times more than ESL)
    The diaphragm of a traditional tweeter (Scan Speak D2905/970000) weighs 0.45 g and has a piston area of 8.5 cm2 giving 0.053 g/cm2 (120 times more than ESL).
  7. Air is quite heavy: 1.2 kg/m3 at 20 degrees C, or 0.0012 g/cm3.
    If we look at the air in a layer 1 cm thick on each side of the diaphragm, we find that the air weighs about 0.0024 g per cm2 of diaphragm, while the diaphragm itself weighs 0.00045 g per cm2. One cm of air on each side weighs about 5 times more than the diaphragm.
    The weight of the air dominates, and because air is so heavy and so reluctant to move, we will get superior damping from the air. We think this is the physical explanation behind the superior sound quality of the ESL speaker.
  8. The size of the ESL element is limited for esthetic reasons. It works as a perfect line source from 300 Hz to 20,000 Hz and has none of the problems associated with traditional crossovers in this important frequency domain.
  9. Below 300 Hz we use a dynamic woofer.
  10. ESL speakers are capacitive loads, and together with the step-up transformer this may create serious problems. A transformer with a step-up ratio of n acts as an impedance transformer with an impedance ratio of n*n, in our case 40,000 from the secondary to the primary side!
    The total capacitance of all segments is 400 pF, or 16 uF on the primary side, which is a load that few amplifiers are able to handle, but due to segmentation and ladder resistors our load is much easier.
    At 20,000 Hz our amplifier will only see the middle segment, which is about 30 pF, or 1.17 uF on the primary side, and this capacitor will be in series with a ladder resistor of 82k (2 ohm on the primary side).
    The net result of segmentation and ladder resistors is a load that is primarily resistive between 6 and 8 ohm.
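The primary-side figures in point 10 follow directly from the impedance ratio, and can be reproduced with the sketch below. The step-up ratio of 200 is not stated in the text; it is inferred here as the square root of the quoted 40,000. The transformer is treated as ideal, and the function names are made up for illustration.

```python
TURNS_RATIO = 200                   # assumed: square root of the quoted 40,000
IMPEDANCE_RATIO = TURNS_RATIO ** 2  # n * n


def capacitance_on_primary(c_secondary_farad):
    """A capacitance looks n*n times larger when seen from the primary side."""
    return c_secondary_farad * IMPEDANCE_RATIO


def resistance_on_primary(r_secondary_ohm):
    """A resistance looks n*n times smaller when seen from the primary side."""
    return r_secondary_ohm / IMPEDANCE_RATIO


print(f"All segments:    {capacitance_on_primary(400e-12) * 1e6:.2f} uF")
print(f"Middle segment:  {capacitance_on_primary(30e-12) * 1e6:.2f} uF")
print(f"Ladder resistor: {resistance_on_primary(82e3):.2f} ohm")
```

With a rounded 30 pF the sketch gives a slightly higher value than the 1.17 uF quoted above, which suggests that the real middle segment is a little under 30 pF.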

Next page: Specifications