Guinness wrote:
I think it would be kind of useful to get a frequency distribution for each note played in the ITM repertoire, from (G,) to (d,) to justify getting certain keys on the flute (low C, low C#, Eb, Fnat long, Fnat short, G#, Bb and Cnat) given that each of these keys can cost hundreds of dollars.
(This is quoted from a posting several years ago...)
Irish music seems to have a very formal structure, so I've
decided to write some programs to examine a corpus of Irish music to
see what structures there are beyond the obvious tune & turn. I
thought this forum might enjoy seeing some preliminary results.
MUSIC CORPUS
This work is based on a collection of approximately 775 reels
available from *Henrik Norbeck*.
OTHER NOTES
For consistency (and because the tinwhistle is typically played in D),
all music was transposed into the key of D. I did not typically write
out the F and C sharps.
Further, all tunes (and note patterns) were normalized into a base
octave. That is, if I found two note patterns:
abc
ABC
I recognized them as the same pattern by "normalizing" the pattern in
the higher octave into the base octave.
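A minimal Python sketch of this normalization follows. It assumes standard ABC octave conventions (lowercase letters are one octave above uppercase, an apostrophe raises and a comma lowers by a further octave); the function name is made up for illustration and is not taken from the original programs.

```python
import re

def normalize_octave(pattern: str) -> str:
    """Fold an ABC note pattern into a single base octave.

    Hypothetical helper: octave marks (' and ,) are dropped and every
    letter is upper-cased, so "abc" and "ABC" compare equal.
    Durations such as the 2 in "d2" are left untouched.
    """
    without_marks = re.sub(r"[',]", "", pattern)
    return without_marks.upper()

print(normalize_octave("abc"))
print(normalize_octave("ABC"))
```

Both calls print the same string, which is what lets the two patterns be counted as one.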
RESULTS
NOTE FREQUENCIES
The simplest feature to look at is the frequency with which the various
tones occur in the music. With songs transposed into D, these are the
relative frequencies of the notes:
23223 D (21%) Unison
18780 A (17%) 5th
18750 F# (17%) 3rd
18333 E (17%) 2nd
14105 B (13%) 6th
8841 G (8%) 4th
7780 C# (7%) 7th
The relative paucity of 4ths and 7ths suggests that many of the reels
may be written in a pentatonic scale, but this turns out not to be the
case. Only 14 of the 779 reels completely avoid the 4th and 7th. I'd
be interested if anyone has an explanation for this distribution.
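For anyone who wants to reproduce this kind of tally, here is a rough Python sketch. It assumes each tune has already been transposed to D and folded into one octave as described above. The function names are invented for illustration and are not the original programs. The last few lines recompute the percentages from the counts in the table.

```python
from collections import Counter

def note_counts(tune: str) -> Counter:
    """Count the pitch letters A-G in a transposed, octave-folded tune."""
    return Counter(ch for ch in tune if ch in "ABCDEFG")

def avoids_4th_and_7th(tune: str) -> bool:
    """True if a tune in D uses neither G (the 4th) nor C (the 7th)."""
    counts = note_counts(tune)
    return counts["G"] == 0 and counts["C"] == 0

# Counts copied from the table above.
table = {"D": 23223, "A": 18780, "F#": 18750, "E": 18333,
         "B": 14105, "G": 8841, "C#": 7780}
total = sum(table.values())
percent = {note: round(100 * n / total) for note, n in table.items()}

for note, n in table.items():
    print(f"{n:6d} {note:2s} ({percent[note]}%)")
```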
It's also interesting to look at the relative frequencies of the notes
broken down by duration (a trailing number is the note length in ABC
units; no number means one unit):
19483 D
17497 F
16871 A
16362 E
13154 B
8066 G
7536 C
2876 D2
1639 A2
1510 E2
971 F2
796 D3
764 B2
606 G2
445 E3
282 F3
264 A3
185 B3
167 G3
152 C2
92 C3
68 D4
16 E4
6 A4
2 G4
2 B4
There is a strong dominance of quarter notes.
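A tally like the one above could be produced along these lines. This is only a sketch: the regular expression is a simplification that handles plain notes with whole-number lengths, and it ignores fractional lengths, chords and rests. The names are hypothetical.

```python
import re
from collections import Counter

# A pitch letter, optional octave marks, then an optional whole-number length.
TOKEN = re.compile(r"([A-Ga-g])[,']*(\d*)")

def duration_counts(abc_body: str) -> Counter:
    """Count note+duration tokens, folded into one octave.

    "d2" and "D2" are both counted as "D2"; a note with no number
    is counted under its bare letter.
    """
    counts = Counter()
    for letter, length in TOKEN.findall(abc_body):
        counts[letter.upper() + length] += 1
    return counts

print(duration_counts("d2BG AGEG"))
```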
A common rule of musical composition is to end the song on the tonic
note. The following table shows the notes most commonly used to end
songs in the corpus (after transposing all songs into Dmaj):
196 D
135 E
122 A
113 B
97 F
58 G
52 C
Interestingly, although D is the most common resolution note, it is
not an overwhelming favorite. One reason for this may be that some of
the tunes are modal. Another reason may be that many Irish dance
tunes do not properly "resolve". This makes the playing of a "set" of
tunes easy, and musicians are expected to add their own resolution
when ending a set.
The durations of ending notes are shown in the following table:
563 1/4
177 2/4
25 4/4
14 3/4
Examining the resolution of the songs suggests looking at the opening
note of the song as well:
190 D
176 A
128 E
123 F
102 B
36 G
23 C
The durations of the opening notes:
442 1/4
226 2/4
111 3/4
Opening notes demonstrate a more even distribution, although it's
interesting that not a single tune begins with a whole note.
INTERVAL FREQUENCIES
Next I looked at the interval from one note to the next (in scale
steps). A step up of one scale degree is written as +1; down as -1; thus
+1 corresponds to an ascending 2nd, while -3 corresponds to a
descending 4th. Here are the most common intervals:
24544 +1
21874 -1
14173 -2
9460 +2
9239 +0
5982 -9
5098 +5
5035 +6
4904 -6
3613 -4
3601 +8
3227 +3
3038 -5
2915 +9
2610 +4
2013 -3
1861 -8
[...]
574 +7
214 -7
Most (61%) intervals are 2nds, 3rds, or unisons. Of the simple
intervals, octaves (+7 and -7) are very seldom seen, and to a lesser
extent fourths and fifths.
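The interval measure can be sketched in Python as below. This version keeps the octave of each note, so an octave leap comes out as +7 or -7. Whether the original program kept octaves at this step is not stated, so treat that as an assumption. The helper names are invented for illustration.

```python
import re
from collections import Counter

# A pitch letter, its octave marks, and an (ignored) whole-number length.
NOTE = re.compile(r"([A-Ga-g])([,']*)\d*")

def scale_position(letter: str, marks: str) -> int:
    """Position of a note counted in scale steps, seven steps per octave."""
    octave = (1 if letter.islower() else 0) + marks.count("'") - marks.count(",")
    return "CDEFGAB".index(letter.upper()) + 7 * octave

def interval_counts(abc_body: str) -> Counter:
    """Count signed scale-step intervals between consecutive notes."""
    positions = [scale_position(l, m) for l, m in NOTE.findall(abc_body)]
    return Counter(f"{b - a:+d}" for a, b in zip(positions, positions[1:]))

print(interval_counts("d2BG AGEG"))
```

Note that bar lines and spaces are skipped, so intervals are also counted across bar boundaries.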
NOTE SEQUENCES
Although the distribution of tones and intervals is interesting, it is
more interesting to ask if there are common musical "phrases" in Irish
music. There are several different ways to define phrasing, but for
the moment let's consider the simplest case: any sequence of notes
within a song.
From a composition viewpoint, note sequences aren't very useful
because they can cross phrasing, bar and repeat boundaries. From a
performance viewpoint they are more interesting, because they identify
the most common sequences or runs a musician will have to play to
perform the corpus.
PLAYING COMMON SEQUENCES
In general, we could find the most common note sequences by leaving
all songs in their original key, extracting sequences notated in C, and
retaining the original octaves. For example, for reels in the key of D,
these are the ten most common four-note sequences:
291 dBAF#
271 def#d
257 dc#AG
256 edc#A
231 c#AGE
230 ef#de
221 Bc#de
217 f#edB
207 f#edc#
205 def#g
Any musician could use this list to identify common note sequences
needed to play the corpus.
However, as a tinwhistle player I'm really more interested in the
common *fingering* sequences. In addition to being diatonic, the
tinwhistle has the interesting property that tones an octave apart
have the same fingering.
The D tinwhistle is the traditional choice for Celtic music, and is
commonly used for tunes in Dmaj or Gmaj. Analyzing the corpus of D
and G reels, and normalizing into a base octave to collapse similar
fingerings, produces this list of common two-note sequences:
3801 DE
3688 ED
3350 AG
3344 EF
3153 FD
3093 FG
3081 AB
2906 GE
2871 GA
2862 BA
All told there are 80,000 two-note sequences in this part of the corpus.
There are about 155 unique "fingerings", and the top ten fingerings
comprise 40% of the sequences. Note that these top ten fingerings
also reflect the common distribution of intervals (primarily 2nds and
3rds).
This short "song" includes the top ten sequences:
X:1
T:Practice Fingerings for D Tinwhistle
M:C|
K:D
FGAB AGEF | DED2
As it turns out, maintaining octave and duration information only
changes the top ten fingerings slightly (replacing GA and FG with dB
and AF).
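The claim that the practice tune covers all of the top ten sequences is easy to check. The sketch below counts adjacent note pairs after folding everything into one octave and ignoring durations and bar lines; the function name is hypothetical.

```python
from collections import Counter

def two_note_sequences(notes: str) -> Counter:
    """Count adjacent note pairs, ignoring octave, duration and bar lines."""
    letters = [ch for ch in notes.upper() if ch in "ABCDEFG"]
    return Counter(a + b for a, b in zip(letters, letters[1:]))

# The top ten list and the practice tune, copied from above.
top_ten = ["DE", "ED", "AG", "EF", "FD", "FG", "AB", "GE", "GA", "BA"]
practice = "FGAB AGEF | DED2"

found = two_note_sequences(practice)
missing = [seq for seq in top_ten if seq not in found]
print("missing:", missing)
```

An empty "missing" list means every one of the ten sequences appears somewhere in the tune.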
INTERVAL SEQUENCES
The previous analysis looked at *note* sequences. We can also examine
the intervals between sequences of notes. For example, the four most
popular 2 note sequences (DE, ED, AG, and EF) are all intervals of
seconds. That is, the second note in the sequence is one whole tone
(a second) above or below the first.
Here are the ten most common intervals:
24544 +1 (ascending 2nd)
21874 -1 (descending 2nd)
14173 -2 (descending 3rd)
9460 +2 (etc.)
9239 +0
5982 -9
5098 +5
5035 +6
4904 -6
3613 -4
The five simplest intervals make up about 60% of the corpus.
MUSICAL PHRASES
Let's now consider a more sophisticated definition of a musical
phrase:
A phrase is a sequence of notes that begins on
an emphasis beat and ends just before an emphasis beat.
In 4/4 time, emphasis is on the first and third beats of a measure, so
a phrase could start on the first beat and end after the second beat,
start on the first beat and end after the fourth beat, start on the
third beat and end after the fourth beat, etc.
Consider this reel from the corpus:
X:7
T:For the Sakes of Old Decency
R:reel
D:Chieftains Live.
D:Michael Tubridy: The Eagle's Whistle.
Z:hn-reel-7
M:C|
K:G
d2BG AGEG|DGBG A2AB|d2BG AGEG|1 DGAG EGAB:|2 DGAG EG~G2||
|:~G3B d2Bd|eaag eg~g2|~G3B d2Bd|1 dega bged:|2 dega bage||
In the first bar of this reel we could select "d2BG", "d2BG AGEG" or
"AGEG" as phrases. All these repeat in the third bar. We would not
want to select (for example) "GAG" as a phrase from the first bar.
Although it repeats in several places, it violates our common sense
notion of a musical phrase by starting in the middle of a beat and
ending after an emphasis beat.
To begin with, we looked at two-beat phrases that begin on an emphasis
beat. (In the first bar of the reel above, these would be "d2BG" and
"AGEG".) The corpus contains about 35,000 phrases of this sort, of
which 3662 are distinct. (The average phrase appears about 10 times in
the corpus; 1439 phrases appear only once.) The top ten most common
phrases are:
507 dBAF
340 FDD2
273 edBA
261 D2FD
259 edBd
243 DEFD
241 ABde
233 AFF2
224 fedB
209 EFGE
These represent about 8% of the total phrases in the corpus. The most
popular phrase, dBAF, is far more common than any other phrase. It's
also interesting to compare this list to the most common two note
sequences above.
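Here is one way the two-beat phrase extraction could be sketched in Python. It assumes the ABC unit note length is an eighth note, so that two beats of a reel span four units, and it drops ornament marks such as "~". How the original program treated a note that straddles a phrase boundary is not stated. This sketch simply rejects such a bar. All names are made up for illustration.

```python
import re

# A pitch (letter plus octave marks) and an optional whole-number length.
NOTE = re.compile(r"([A-Ga-g][,']*)(\d*)")

def two_beat_phrases(bar: str, units_per_phrase: int = 4) -> list:
    """Split one bar into phrases that each last `units_per_phrase` units.

    Raises ValueError if a note crosses a phrase boundary. A trailing
    incomplete phrase (for example in a pickup bar) is dropped.
    """
    phrases, current, filled = [], "", 0
    for pitch, length in NOTE.findall(bar):
        current += pitch + length
        filled += int(length) if length else 1
        if filled == units_per_phrase:
            phrases.append(current)
            current, filled = "", 0
        elif filled > units_per_phrase:
            raise ValueError(f"note crosses a phrase boundary in {bar!r}")
    return phrases

print(two_beat_phrases("d2BG AGEG"))
```

Feeding every bar of every tune through a function like this and counting the results with collections.Counter would give a table of the same shape as the one above.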
We can perform a similar analysis on intervals, to find the most
common interval sequences that make up a two beat phrase:
883 -2+0
707 +1
575 +1+1-2
533 -4+0
532 +2-2
435 +1+1+1
416 +2-1-1
413 +2+0
397 -1+0
372 -1
Note that because we're measuring these phrases in beats, they can
have differing numbers of notes. The most common interval pattern,
-2+0, matches (among many other sequences) the second most common note
sequence: FDD2.
The second most common interval pattern, +1, (usually) represents a
dotted quarter note followed by an eighth note. This pattern is
surprisingly common; see, for example:
X: 46
T:Galtee Ranger, The
T:Humours of Galteemore, The
T:Callaghan's
R:reel
Z:hn-reel-46
M:4/4
K:D
|:AF~F2 FEDE|~F3E F2dB|AF~F2 FEDE|1 FBBA FEEF :|2 F2EG FDD2||
~A3B AF~F2|ABde fe~e2|fedc BcdB|ABde fedB|
~A3B AGFG|ABde fe~e2|fedc BcdB|ABde fedB||
where it appears as A3B twice in the turn. The similar pattern -1
appears in the second bar as F3E.
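The step from a phrase to its interval pattern can be sketched as follows. The code measures intervals in scale steps and keeps octaves, which is an assumption about the original method, and the function names are hypothetical. Grouping phrases by pattern shows how several different note sequences collapse onto one interval sequence.

```python
import re
from collections import defaultdict

# A pitch letter, its octave marks, and an (ignored) whole-number length.
NOTE = re.compile(r"([A-Ga-g])([,']*)\d*")

def interval_pattern(phrase: str) -> str:
    """Return the signed scale-step intervals of a phrase, e.g. "-2+0"."""
    positions = []
    for letter, marks in NOTE.findall(phrase):
        octave = (1 if letter.islower() else 0) + marks.count("'") - marks.count(",")
        positions.append("CDEFGAB".index(letter.upper()) + 7 * octave)
    return "".join(f"{b - a:+d}" for a, b in zip(positions, positions[1:]))

def group_by_pattern(phrases) -> dict:
    """Map each interval pattern to the list of phrases that produce it."""
    groups = defaultdict(list)
    for phrase in phrases:
        groups[interval_pattern(phrase)].append(phrase)
    return dict(groups)

print(group_by_pattern(["FDD2", "AFF2", "A3B", "F3E"]))
```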
Next we looked at four beat phrases. We would expect far fewer
"common" phrases at this length. And indeed, the average four beat
phrase appears only about one and a half times in the corpus -- versus
almost ten times for two beat phrases. (15,000 four beat phrases
appear only once.) The top ten most common are:
42 d2fdAdfd
37 D2FDADFD
29 G2BGEFGE
28 DFAFBFAF
28 A2FADAFA
27 dBAFDEFA
27 AFF2ABde
26 edBcdBAF
24 FAA2BFAF
23 dBAFFEE2
Note that the top two phrases differ in only one note (ignoring the
octave shift).
Let's return to two beat phrases. Are some phrases particularly
common to begin a bar? To answer this question, we found all the
phrases that began on the first beat of a measure. Here are the top
ten:
298 dBAF
232 D2FD
170 AFF2
163 Beed
149 DEFD
149 Add2
144 FDD2
140 ABde
135 edBA
129 FAA2
Three new phrases sneak into this list: Beed, Add2 and FAA2. Five
phrases in this list have doubled notes, versus only two in the list
of phrases starting on all emphasis beats.
How about phrases that end a bar? Here are the top ten:
222 edBd
209 dBAF
196 FDD2
176 BAFA
159 EFGE
158 edBc
141 fedB
138 edBA
136 BcdB
116 BFAF
Four new phrases show up: BAFA, edBc, BcdB, BFAF. Two of these use C,
the most uncommon tone in the corpus. Note also that edBd, edBc and
edBA all appear in this list.
Let's extend the idea of "phrases that end a bar" a bit and look at
phrases that end a "passage" -- where a passage is a portion of the
song ending in repeat bars or at the end of the song. There are about
3000 "passages" in the corpus. Here are the top ten passage-ending
phrases:
147 FDD2
79 FEE2
79 D2DE
57 BAA2
56 D4
48 FDDE
46 D3E
43 EFGE
43 DEFG
38 d2dB
Amazingly, FDD2 accounts for about 5% of the phrase endings in the
corpus. Altogether, the top ten phrases account for over 20% of the
endings. (D4 appears only 68 times in the corpus, but 56 times as the
ending phrase of a passage.) Repeated notes (FDD2, FEE2, BAA2) are
popular phrase endings.
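A rough Python sketch of the passage-ending extraction is given below. It treats repeat signs and double bars as passage boundaries. Counting double bars this way is an assumption, as are the eighth-note unit length and the function name. It expects only the tune body, without the ABC header lines.

```python
import re

# Passage boundaries: "::", ":|", "||" or "|]".
PASSAGE_END = re.compile(r"::|:\||\|\||\|\]")
# A pitch (letter plus octave marks) and an optional whole-number length.
NOTE = re.compile(r"([A-Ga-g][,']*)(\d*)")

def passage_endings(body: str, units: int = 4) -> list:
    """Return the final two-beat phrase of each passage in a tune body."""
    endings = []
    for passage in PASSAGE_END.split(body):
        tail, filled = [], 0
        # Walk backwards from the end until two beats are collected.
        for pitch, length in reversed(NOTE.findall(passage)):
            tail.append(pitch + length)
            filled += int(length) if length else 1
            if filled >= units:
                break
        # Keep only tails that fill exactly two beats.
        if filled == units:
            endings.append("".join(reversed(tail)))
    return endings

body = (
    "d2BG AGEG|DGBG A2AB|d2BG AGEG|1 DGAG EGAB:|2 DGAG EG~G2||\n"
    "|:~G3B d2Bd|eaag eg~g2|~G3B d2Bd|1 dega bged:|2 dega bage||"
)
print(passage_endings(body))
```

The example body is the reel "For the Sakes of Old Decency" quoted earlier; first and second endings each count as a passage ending here.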