# (Project) Project NewickVector

## Introduction

Project NewickVector is the successor of Project TwoDigitNewick, with the goal of supporting Newicks of any arity.

Project NewickVector has been developed by (sorted alphabetically on first name):

Project NewickVector started in February 2011.

• Procedure
• Same results for binary phylogenies
• Validating results for phylogenies of any arity

## Procedure

The code of Project NewickVector is compared with that of the previous projects, on both results and speed.

All measurements have been performed on the same computer. In all simulations, the value of parameter theta was set to 10.0.

## Same results for binary phylogenies

In a previous project, 162 different phylogenies, with complexities from 24 to 866052, were used for comparison and validation.

These same phylogenies are used to test that Project NewickVector (this project) calculates the same probabilities as ProjectRampal_EndVersion2.

## Validating results for phylogenies of any arity

The probabilities calculated for trinary (and higher-arity) phylogenies are validated from probabilities calculated for already validated binary phylogenies. Each calculation shown below is generated by computer.

Note that these phylogenies can be written down in different ways: for example, '(1,(2,3))' is the same phylogeny as '((3,2),1)'. At startup, the simulation checks all the results below in all their possible notations.
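
As an illustration of why two notations denote the same phylogeny, the sketch below reduces a Newick string to a canonical form by sorting the children of every bracket. It assumes that sibling order is the only difference between notations. The helper names `parse` and `canonical` are hypothetical and are not taken from the project's code.

```python
def parse(newick):
    """Parse a Newick string such as '(1,(2,3))' into nested lists of ints."""
    pos = 0

    def node():
        nonlocal pos
        if newick[pos] == '(':
            pos += 1  # skip '('
            children = [node()]
            while newick[pos] == ',':
                pos += 1  # skip ','
                children.append(node())
            pos += 1  # skip ')'
            return children
        start = pos
        while pos < len(newick) and newick[pos].isdigit():
            pos += 1
        return int(newick[start:pos])

    return node()


def canonical(tree):
    """Return a notation-independent string: children are sorted at every level."""
    if isinstance(tree, int):
        return str(tree)
    return '(' + ','.join(sorted(canonical(child) for child in tree)) + ')'


# Both notations from the text reduce to the same canonical string.
same = canonical(parse('(1,(2,3))')) == canonical(parse('((3,2),1)'))
```

Sorting the canonical strings of the children (rather than the raw values) keeps the comparison well defined when a bracket mixes leaves and nested brackets.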

### #1: (1,1,(1,1))

```
N = the phylogeny = (1,1,(1,1))
t = theta = 10
p(N,t) = probability = SUM(c_i * p_i)
c(N,t) = coefficient
D(N,t) = denominator = 12
N'1 = (2,(1,1))
N'2 = (2,(1,1))
N'3 = (1,1,2)
N'4 = (1,1,2)
For t = 10:
p'1 = 0.019425019425019428 (calculated at once with NewickVector)
p'2 = 0.019425019425019428 (calculated at once with NewickVector)
p'3 = 0.058275058275058272 (calculated at once with NewickVector)
p'4 = 0.058275058275058272 (calculated at once with NewickVector)
c'1 = t / d = 10 / 12 = 0.83333333333333337 (as f equals 1)
c'2 = t / d = 10 / 12 = 0.83333333333333337 (as f equals 1)
c'3 = t / d = 10 / 12 = 0.83333333333333337 (as f equals 1)
c'4 = t / d = 10 / 12 = 0.83333333333333337 (as f equals 1)
p(N,t) = SUM(c_i * p_i)
       =   ( 0.83333333333333337 * 0.019425019425019428 )
         + ( 0.83333333333333337 * 0.019425019425019428 )
         + ( 0.83333333333333337 * 0.058275058275058272 )
         + ( 0.83333333333333337 * 0.058275058275058272 )
       = 0.12950012950012951 (hand-calculated)
       = 0.12950012950012951 (calculated at once by NewickVector)
```

### #2: (1,(1,1,1))

```
N = the phylogeny = (1,(1,1,1))
t = theta = 10
p(N,t) = probability = SUM(c_i * p_i)
c(N,t) = coefficient
D(N,t) = denominator = 12
N'1 = (1,(2,1))
N'2 = (1,(1,2))
N'3 = (1,(2,1))
N'4 = (1,(1,2))
N'5 = (1,(1,2))
N'6 = (1,(2,1))
For t = 10:
p'1 = 0.019425019425019428 (calculated at once with NewickVector)
p'2 = 0.019425019425019428 (calculated at once with NewickVector)
p'3 = 0.019425019425019428 (calculated at once with NewickVector)
p'4 = 0.019425019425019428 (calculated at once with NewickVector)
p'5 = 0.019425019425019428 (calculated at once with NewickVector)
p'6 = 0.019425019425019428 (calculated at once with NewickVector)
c'1 = t / d = 10 / 12 = 0.83333333333333337 (as f equals 1)
c'2 = t / d = 10 / 12 = 0.83333333333333337 (as f equals 1)
c'3 = t / d = 10 / 12 = 0.83333333333333337 (as f equals 1)
c'4 = t / d = 10 / 12 = 0.83333333333333337 (as f equals 1)
c'5 = t / d = 10 / 12 = 0.83333333333333337 (as f equals 1)
c'6 = t / d = 10 / 12 = 0.83333333333333337 (as f equals 1)
p(N,t) = SUM(c_i * p_i)
       =   ( 0.83333333333333337 * 0.019425019425019428 )
         + ( 0.83333333333333337 * 0.019425019425019428 )
         + ( 0.83333333333333337 * 0.019425019425019428 )
         + ( 0.83333333333333337 * 0.019425019425019428 )
         + ( 0.83333333333333337 * 0.019425019425019428 )
         + ( 0.83333333333333337 * 0.019425019425019428 )
       = 0.097125097125097148 (hand-calculated)
       = 0.097125097125097148 (calculated at once by NewickVector)
```

### #3: (1,1,(1,2))

```
N = the phylogeny = (1,1,(1,2))
t = theta = 10
p(N,t) = probability = SUM(c_i * p_i)
c(N,t) = coefficient
D(N,t) = denominator = 40
N'1 = (2,(1,2))
N'2 = (2,(1,2))
N'3 = (1,1,3)
N'4 = (1,1,(1,1))
For t = 10:
p'1 = 0.0014337514337514339 (calculated at once with NewickVector)
p'2 = 0.0014337514337514339 (calculated at once with NewickVector)
p'3 = 0.0083250083250083241 (calculated at once with NewickVector)
p'4 = 0.12950012950012951 (calculated at once with NewickVector)
c'1 = t / d = 10 / 40 = 0.25 (as f equals 1)
c'2 = t / d = 10 / 40 = 0.25 (as f equals 1)
c'3 = t / d = 10 / 40 = 0.25 (as f equals 1)
c'4 = (f*(f-1)) / d = 2 / 40 = 0.050000000000000003 (as f equals 2)
p(N,t) = SUM(c_i * p_i)
       =   ( 0.25 * 0.0014337514337514339 )
         + ( 0.25 * 0.0014337514337514339 )
         + ( 0.25 * 0.0083250083250083241 )
         + ( 0.050000000000000003 * 0.12950012950012951 )
       = 0.0092731342731342745 (hand-calculated)
       = 0.0092731342731342745 (calculated at once by NewickVector)
```
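
The listing above uses two coefficient rules: `t / d` when the frequency `f` equals 1, and `(f*(f-1)) / d` otherwise. The following minimal sketch recomputes the sum for this example from the printed values. The function names `coefficient` and `probability` are hypothetical, and the denominator is taken from the listing rather than derived.

```python
def coefficient(f, theta, d):
    """Coefficient rule as written in the listing: t/d if f == 1, else f*(f-1)/d."""
    if f == 1:
        return theta / d
    return f * (f - 1) / d


def probability(terms, theta, d):
    """Sum c_i * p_i over (f, p) pairs, one pair per derived phylogeny."""
    return sum(coefficient(f, theta, d) * p for f, p in terms)


# (f, p') pairs copied from the listing for (1,1,(1,2)).
example_3 = [
    (1, 0.0014337514337514339),
    (1, 0.0014337514337514339),
    (1, 0.0083250083250083241),
    (2, 0.12950012950012951),
]

result = probability(example_3, theta=10.0, d=40.0)
print(result)  # compare with the value printed in the listing
```

The result is best compared with the listed probability using a small tolerance, since the order of the floating-point additions may differ from the original program.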

### #4: (1,1,(1,1,1))

```
N = the phylogeny = (1,1,(1,1,1))
t = theta = 10
p(N,t) = probability = SUM(c_i * p_i)
c(N,t) = coefficient
D(N,t) = denominator = 20
N'1 = (2,(1,1,1))
N'2 = (2,(1,1,1))
N'3 = (1,1,(2,1))
N'4 = (1,1,(1,2))
N'5 = (1,1,(2,1))
N'6 = (1,1,(1,2))
N'7 = (1,1,(1,2))
N'8 = (1,1,(2,1))
For t = 10:
p'1 = 0.0070068820068820096 (calculated at once with NewickVector)
p'2 = 0.0070068820068820096 (calculated at once with NewickVector)
p'3 = 0.0092731342731342745 (calculated at once with NewickVector)
p'4 = 0.0092731342731342745 (calculated at once with NewickVector)
p'5 = 0.0092731342731342745 (calculated at once with NewickVector)
p'6 = 0.0092731342731342745 (calculated at once with NewickVector)
p'7 = 0.0092731342731342745 (calculated at once with NewickVector)
p'8 = 0.0092731342731342745 (calculated at once with NewickVector)
c'1 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'2 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'3 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'4 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'5 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'6 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'7 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'8 = t / d = 10 / 20 = 0.5 (as f equals 1)
p(N,t) = SUM(c_i * p_i)
       =   ( 0.5 * 0.0070068820068820096 )
         + ( 0.5 * 0.0070068820068820096 )
         + ( 0.5 * 0.0092731342731342745 )
         + ( 0.5 * 0.0092731342731342745 )
         + ( 0.5 * 0.0092731342731342745 )
         + ( 0.5 * 0.0092731342731342745 )
         + ( 0.5 * 0.0092731342731342745 )
         + ( 0.5 * 0.0092731342731342745 )
       = 0.034826284826284831 (hand-calculated)
       = 0.034826284826284831 (calculated at once by NewickVector)
```

### #5: (2,(1,1,1))

```
N = the phylogeny = (2,(1,1,1))
t = theta = 10
p(N,t) = probability = SUM(c_i * p_i)
c(N,t) = coefficient
D(N,t) = denominator = 40
N'1 = (1,(1,1,1))
N'2 = (2,(2,1))
N'3 = (2,(1,2))
N'4 = (2,(2,1))
N'5 = (2,(1,2))
N'6 = (2,(1,2))
N'7 = (2,(2,1))
For t = 10:
p'1 = 0.097125097125097148 (calculated at once with NewickVector)
p'2 = 0.0014337514337514339 (calculated at once with NewickVector)
p'3 = 0.0014337514337514339 (calculated at once with NewickVector)
p'4 = 0.0014337514337514339 (calculated at once with NewickVector)
p'5 = 0.0014337514337514339 (calculated at once with NewickVector)
p'6 = 0.0014337514337514339 (calculated at once with NewickVector)
p'7 = 0.0014337514337514339 (calculated at once with NewickVector)
c'1 = (f*(f-1)) / d = 2 / 40 = 0.050000000000000003 (as f equals 2)
c'2 = t / d = 10 / 40 = 0.25 (as f equals 1)
c'3 = t / d = 10 / 40 = 0.25 (as f equals 1)
c'4 = t / d = 10 / 40 = 0.25 (as f equals 1)
c'5 = t / d = 10 / 40 = 0.25 (as f equals 1)
c'6 = t / d = 10 / 40 = 0.25 (as f equals 1)
c'7 = t / d = 10 / 40 = 0.25 (as f equals 1)
p(N,t) = SUM(c_i * p_i)
       =   ( 0.050000000000000003 * 0.097125097125097148 )
         + ( 0.25 * 0.0014337514337514339 )
         + ( 0.25 * 0.0014337514337514339 )
         + ( 0.25 * 0.0014337514337514339 )
         + ( 0.25 * 0.0014337514337514339 )
         + ( 0.25 * 0.0014337514337514339 )
         + ( 0.25 * 0.0014337514337514339 )
       = 0.0070068820068820096 (hand-calculated)
       = 0.0070068820068820096 (calculated at once by NewickVector)
```

### #6: (1,1,1,(1,1))

```
N = the phylogeny = (1,1,1,(1,1))
t = theta = 10
p(N,t) = probability = SUM(c_i * p_i)
c(N,t) = coefficient
D(N,t) = denominator = 20
N'1 = (2,1,(1,1))
N'2 = (1,2,(1,1))
N'3 = (2,1,(1,1))
N'4 = (1,2,(1,1))
N'5 = (1,2,(1,1))
N'6 = (2,1,(1,1))
N'7 = (1,1,1,2)
N'8 = (1,1,1,2)
For t = 10:
p'1 = 0.0092222592222592232 (calculated at once with NewickVector)
p'2 = 0.0092222592222592232 (calculated at once with NewickVector)
p'3 = 0.0092222592222592232 (calculated at once with NewickVector)
p'4 = 0.0092222592222592232 (calculated at once with NewickVector)
p'5 = 0.0092222592222592232 (calculated at once with NewickVector)
p'6 = 0.0092222592222592232 (calculated at once with NewickVector)
p'7 = 0.041625041625041624 (calculated at once with NewickVector)
p'8 = 0.041625041625041624 (calculated at once with NewickVector)
c'1 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'2 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'3 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'4 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'5 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'6 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'7 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'8 = t / d = 10 / 20 = 0.5 (as f equals 1)
p(N,t) = SUM(c_i * p_i)
       =   ( 0.5 * 0.0092222592222592232 )
         + ( 0.5 * 0.0092222592222592232 )
         + ( 0.5 * 0.0092222592222592232 )
         + ( 0.5 * 0.0092222592222592232 )
         + ( 0.5 * 0.0092222592222592232 )
         + ( 0.5 * 0.0092222592222592232 )
         + ( 0.5 * 0.041625041625041624 )
         + ( 0.5 * 0.041625041625041624 )
       = 0.069291819291819295 (hand-calculated)
       = 0.069291819291819295 (calculated at once by NewickVector)
```

### #7: (1,2,(1,1))

```
N = the phylogeny = (1,2,(1,1))
t = theta = 10
p(N,t) = probability = SUM(c_i * p_i)
c(N,t) = coefficient
D(N,t) = denominator = 40
N'1 = (3,(1,1))
N'2 = (1,1,(1,1))
N'3 = (1,2,2)
N'4 = (1,2,2)
For t = 10:
p'1 = 0.0026640026640026644 (calculated at once with NewickVector)
p'2 = 0.12950012950012951 (calculated at once with NewickVector)
p'3 = 0.0041625041625041621 (calculated at once with NewickVector)
p'4 = 0.0041625041625041621 (calculated at once with NewickVector)
c'1 = t / d = 10 / 40 = 0.25 (as f equals 1)
c'2 = (f*(f-1)) / d = 2 / 40 = 0.050000000000000003 (as f equals 2)
c'3 = t / d = 10 / 40 = 0.25 (as f equals 1)
c'4 = t / d = 10 / 40 = 0.25 (as f equals 1)
p(N,t) = SUM(c_i * p_i)
       =   ( 0.25 * 0.0026640026640026644 )
         + ( 0.050000000000000003 * 0.12950012950012951 )
         + ( 0.25 * 0.0041625041625041621 )
         + ( 0.25 * 0.0041625041625041621 )
       = 0.0092222592222592232 (hand-calculated)
       = 0.0092222592222592232 (calculated at once by NewickVector)
```

### #8: (1,(1,1,1,1))

```
N = the phylogeny = (1,(1,1,1,1))
t = theta = 10
p(N,t) = probability = SUM(c_i * p_i)
c(N,t) = coefficient
D(N,t) = denominator = 20
N'1 = (1,(2,1,1))
N'2 = (1,(1,2,1))
N'3 = (1,(1,1,2))
N'4 = (1,(2,1,1))
N'5 = (1,(1,2,1))
N'6 = (1,(1,1,2))
N'7 = (1,(1,2,1))
N'8 = (1,(2,1,1))
N'9 = (1,(1,1,2))
N'10 = (1,(1,1,2))
N'11 = (1,(1,2,1))
N'12 = (1,(2,1,1))
For t = 10:
p'1 = 0.006919006919006921 (calculated at once with NewickVector)
p'2 = 0.006919006919006921 (calculated at once with NewickVector)
p'3 = 0.006919006919006921 (calculated at once with NewickVector)
p'4 = 0.006919006919006921 (calculated at once with NewickVector)
p'5 = 0.006919006919006921 (calculated at once with NewickVector)
p'6 = 0.006919006919006921 (calculated at once with NewickVector)
p'7 = 0.006919006919006921 (calculated at once with NewickVector)
p'8 = 0.006919006919006921 (calculated at once with NewickVector)
p'9 = 0.006919006919006921 (calculated at once with NewickVector)
p'10 = 0.006919006919006921 (calculated at once with NewickVector)
p'11 = 0.006919006919006921 (calculated at once with NewickVector)
p'12 = 0.006919006919006921 (calculated at once with NewickVector)
c'1 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'2 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'3 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'4 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'5 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'6 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'7 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'8 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'9 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'10 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'11 = t / d = 10 / 20 = 0.5 (as f equals 1)
c'12 = t / d = 10 / 20 = 0.5 (as f equals 1)
p(N,t) = SUM(c_i * p_i)
       =   ( 0.5 * 0.006919006919006921 )
         + ( 0.5 * 0.006919006919006921 )
         + ( 0.5 * 0.006919006919006921 )
         + ( 0.5 * 0.006919006919006921 )
         + ( 0.5 * 0.006919006919006921 )
         + ( 0.5 * 0.006919006919006921 )
         + ( 0.5 * 0.006919006919006921 )
         + ( 0.5 * 0.006919006919006921 )
         + ( 0.5 * 0.006919006919006921 )
         + ( 0.5 * 0.006919006919006921 )
         + ( 0.5 * 0.006919006919006921 )
         + ( 0.5 * 0.006919006919006921 )
       = 0.041514041514041512 (hand-calculated)
       = 0.041514041514041512 (calculated at once by NewickVector)
```

### #9: (1,(1,1,2))

```
N = the phylogeny = (1,(1,1,2))
t = theta = 10
p(N,t) = probability = SUM(c_i * p_i)
c(N,t) = coefficient
D(N,t) = denominator = 40
N'1 = (1,(2,2))
N'2 = (1,(1,3))
N'3 = (1,(2,2))
N'4 = (1,(1,3))
N'5 = (1,(1,1,1))
For t = 10:
p'1 = 0.0012950012950012951 (calculated at once with NewickVector)
p'2 = 0.0028305028305028309 (calculated at once with NewickVector)
p'3 = 0.0012950012950012951 (calculated at once with NewickVector)
p'4 = 0.0028305028305028309 (calculated at once with NewickVector)
p'5 = 0.097125097125097148 (calculated at once with NewickVector)
c'1 = t / d = 10 / 40 = 0.25 (as f equals 1)
c'2 = t / d = 10 / 40 = 0.25 (as f equals 1)
c'3 = t / d = 10 / 40 = 0.25 (as f equals 1)
c'4 = t / d = 10 / 40 = 0.25 (as f equals 1)
c'5 = (f*(f-1)) / d = 2 / 40 = 0.050000000000000003 (as f equals 2)
p(N,t) = SUM(c_i * p_i)
       =   ( 0.25 * 0.0012950012950012951 )
         + ( 0.25 * 0.0028305028305028309 )
         + ( 0.25 * 0.0012950012950012951 )
         + ( 0.25 * 0.0028305028305028309 )
         + ( 0.050000000000000003 * 0.097125097125097148 )
       = 0.006919006919006921 (hand-calculated)
       = 0.006919006919006921 (calculated at once by NewickVector)
```
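
The derived phylogenies N'1 to N'5 listed above appear to follow two rules: a leaf with frequency 1 is merged into each sibling leaf in the same bracket (raising that sibling's frequency by one), and a leaf with frequency above 1 is lowered by one. The sketch below reproduces the listing under that assumption. The rules are inferred from the printed examples only, not taken from the project's source, and `reductions` is a hypothetical name.

```python
def reductions(tree):
    """Yield (f, derived_tree) pairs for a phylogeny given as nested lists of ints.

    Assumed rules, inferred from the listings:
      * f > 1 : the leaf's frequency is lowered by one.
      * f == 1: the leaf is removed and one sibling leaf is raised by one,
                once for every sibling leaf in the same bracket.
    A bracket left with a single child is replaced by that child.
    """
    for i, child in enumerate(tree):
        if isinstance(child, list):
            for f, sub in reductions(child):
                new_child = sub[0] if len(sub) == 1 else sub
                yield f, tree[:i] + [new_child] + tree[i + 1:]
        elif child > 1:
            yield child, tree[:i] + [child - 1] + tree[i + 1:]
        else:
            for j, sibling in enumerate(tree):
                if j != i and not isinstance(sibling, list):
                    merged = list(tree)
                    merged[j] = sibling + 1
                    del merged[i]
                    yield 1, merged


# (1,(1,1,2)) written as nested lists.
derived = list(reductions([1, [1, 1, 2]]))
```

Each yielded `f` selects the coefficient rule shown in the listing (`t / d` for `f` equal to 1, `(f*(f-1)) / d` otherwise).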

### #10: (1,(1,1,1,1,1))

```
N = the phylogeny = (1,(1,1,1,1,1))
t = theta = 10
p(N,t) = probability = SUM(c_i * p_i)
c(N,t) = coefficient
D(N,t) = denominator = 30
N'1 = (1,(2,1,1,1))
N'2 = (1,(1,2,1,1))
N'3 = (1,(1,1,2,1))
N'4 = (1,(1,1,1,2))
N'5 = (1,(2,1,1,1))
N'6 = (1,(1,2,1,1))
N'7 = (1,(1,1,2,1))
N'8 = (1,(1,1,1,2))
N'9 = (1,(1,2,1,1))
N'10 = (1,(2,1,1,1))
N'11 = (1,(1,1,2,1))
N'12 = (1,(1,1,1,2))
N'13 = (1,(1,1,2,1))
N'14 = (1,(1,2,1,1))
N'15 = (1,(2,1,1,1))
N'16 = (1,(1,1,1,2))
N'17 = (1,(1,1,1,2))
N'18 = (1,(1,1,2,1))
N'19 = (1,(1,2,1,1))
N'20 = (1,(2,1,1,1))
For t = 10:
p'1 = 0.0027573616859331157 (calculated at once with NewickVector)
p'2 = 0.0027573616859331157 (calculated at once with NewickVector)
p'3 = 0.0027573616859331153 (calculated at once with NewickVector)
p'4 = 0.0027573616859331148 (calculated at once with NewickVector)
p'5 = 0.0027573616859331157 (calculated at once with NewickVector)
p'6 = 0.0027573616859331157 (calculated at once with NewickVector)
p'7 = 0.0027573616859331153 (calculated at once with NewickVector)
p'8 = 0.0027573616859331148 (calculated at once with NewickVector)
p'9 = 0.0027573616859331157 (calculated at once with NewickVector)
p'10 = 0.0027573616859331157 (calculated at once with NewickVector)
p'11 = 0.0027573616859331153 (calculated at once with NewickVector)
p'12 = 0.0027573616859331148 (calculated at once with NewickVector)
p'13 = 0.0027573616859331153 (calculated at once with NewickVector)
p'14 = 0.0027573616859331157 (calculated at once with NewickVector)
p'15 = 0.0027573616859331157 (calculated at once with NewickVector)
p'16 = 0.0027573616859331148 (calculated at once with NewickVector)
p'17 = 0.0027573616859331148 (calculated at once with NewickVector)
p'18 = 0.0027573616859331153 (calculated at once with NewickVector)
p'19 = 0.0027573616859331157 (calculated at once with NewickVector)
p'20 = 0.0027573616859331157 (calculated at once with NewickVector)
c'1 = t / d = 10 / 30 = 0.33333333333333331 (as f equals 1)
c'2 = t / d = 10 / 30 = 0.33333333333333331 (as f equals 1)
c'3 = t / d = 10 / 30 = 0.33333333333333331 (as f equals 1)
c'4 = t / d = 10 / 30 = 0.33333333333333331 (as f equals 1)
c'5 = t / d = 10 / 30 = 0.33333333333333331 (as f equals 1)
c'6 = t / d = 10 / 30 = 0.33333333333333331 (as f equals 1)
c'7 = t / d = 10 / 30 = 0.33333333333333331 (as f equals 1)
c'8 = t / d = 10 / 30 = 0.33333333333333331 (as f equals 1)
c'9 = t / d = 10 / 30 = 0.33333333333333331 (as f equals 1)
c'10 = t / d = 10 / 30 = 0.33333333333333331 (as f equals 1)
c'11 = t / d = 10 / 30 = 0.33333333333333331 (as f equals 1)
c'12 = t / d = 10 / 30 = 0.33333333333333331 (as f equals 1)
c'13 = t / d = 10 / 30 = 0.33333333333333331 (as f equals 1)
c'14 = t / d = 10 / 30 = 0.33333333333333331 (as f equals 1)
c'15 = t / d = 10 / 30 = 0.33333333333333331 (as f equals 1)
c'16 = t / d = 10 / 30 = 0.33333333333333331 (as f equals 1)
c'17 = t / d = 10 / 30 = 0.33333333333333331 (as f equals 1)
c'18 = t / d = 10 / 30 = 0.33333333333333331 (as f equals 1)
c'19 = t / d = 10 / 30 = 0.33333333333333331 (as f equals 1)
c'20 = t / d = 10 / 30 = 0.33333333333333331 (as f equals 1)
p(N,t) = SUM(c_i * p_i)
       =   ( 0.33333333333333331 * 0.0027573616859331157 )
         + ( 0.33333333333333331 * 0.0027573616859331157 )
         + ( 0.33333333333333331 * 0.0027573616859331153 )
         + ( 0.33333333333333331 * 0.0027573616859331148 )
         + ( 0.33333333333333331 * 0.0027573616859331157 )
         + ( 0.33333333333333331 * 0.0027573616859331157 )
         + ( 0.33333333333333331 * 0.0027573616859331153 )
         + ( 0.33333333333333331 * 0.0027573616859331148 )
         + ( 0.33333333333333331 * 0.0027573616859331157 )
         + ( 0.33333333333333331 * 0.0027573616859331157 )
         + ( 0.33333333333333331 * 0.0027573616859331153 )
         + ( 0.33333333333333331 * 0.0027573616859331148 )
         + ( 0.33333333333333331 * 0.0027573616859331153 )
         + ( 0.33333333333333331 * 0.0027573616859331157 )
         + ( 0.33333333333333331 * 0.0027573616859331157 )
         + ( 0.33333333333333331 * 0.0027573616859331148 )
         + ( 0.33333333333333331 * 0.0027573616859331148 )
         + ( 0.33333333333333331 * 0.0027573616859331153 )
         + ( 0.33333333333333331 * 0.0027573616859331157 )
         + ( 0.33333333333333331 * 0.0027573616859331157 )
       = 0.018382411239554104 (hand-calculated)
       = 0.018382411239554104 (calculated at once by NewickVector)
```
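
In this last example the p' values for different notations of the same derived phylogeny differ in their final digits (for example ...157, ...153 and ...148). A plausible explanation, stated here as an assumption, is floating-point rounding that depends on the order of evaluation. The sketch below shows how such values could be compared with a relative tolerance instead of exact equality; the helper name `all_close` and the tolerance are illustrative choices, not the project's.

```python
import math

# Three p' values copied from the listing: the same phylogeny in different notations.
p_values = [
    0.0027573616859331157,
    0.0027573616859331153,
    0.0027573616859331148,
]


def all_close(values, rel_tol=1e-12):
    """True if every value agrees with the first one within a relative tolerance."""
    first = values[0]
    return all(math.isclose(first, v, rel_tol=rel_tol) for v in values)


exact_match = p_values[0] == p_values[2]  # exact comparison of two notations
close_match = all_close(p_values)         # tolerance-based comparison
```

A relative tolerance of 1e-12 is far looser than the differences seen here (a few units in the last place) and far tighter than the gap between distinct phylogenies in the listings.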