DEVELOPMENT OF EST-SSR MARKERS TO ASSESS GENETIC DIVERSITY IN ELETTARIA CARDAMOMUM MATON

Elettaria cardamomum Maton is one of the most ancient and valuable spice crops. Cardamom is cultivated with intensive pesticide use, and alleles present in wild cardamom genotypes could contribute to the genetic improvement of cultivars. However, neither a genetic map nor a whole-genome sequence of E. cardamomum is available, and very limited information on simple sequence repeat (SSR) markers is publicly available. We tested whether SSRs from Curcuma longa can be used to analyze genetic diversity in E. cardamomum.


Introduction
Elettaria cardamomum Maton, also known as the 'Queen of spices', is one of the most ancient and valuable spice crops. It is the third most expensive spice in the world after saffron and vanilla. Cardamom belongs to the Zingiberaceae family, one of the largest families of monocotyledons, comprising about 52 genera and 1500 species. Cardamom is believed to have originated in the moist evergreen forests of the Western Ghats of Southern India, and many wild populations are still confined to this region (Ravindran and Madhusoodanan, 2002). Ecosystem diversity is very limited in cardamom, and the majority of its diversity comes from varietal diversity (Madhusoodanan et al., 1994).
The study of genetic diversity and of interspecific or intergeneric relationships among species is now extensively carried out with the help of molecular markers (Bandopadhyay et al., 2004). SSRs can be classified into genomic SSRs and EST-SSRs based on the original sequences used to identify the microsatellite region (Wei et al., 2011). In the past, development of SSR markers by the conventional approach of creating a genomic library, hybridising with tandemly repeated oligonucleotides and sequencing clones has been very expensive and time consuming (Scott et al., 2000). Primers used to amplify SSR regions in this procedure are species specific, so markers developed in one taxon cannot be transferred to another (Ellis and Burke, 2007). EST-derived SSR markers, in contrast, show good transferability across taxonomic boundaries and can be used as anchor markers for comparative mapping and evolutionary studies. In the present study, 20 EST-SSR primer pairs were custom synthesized from a set of 206 EST-SSR primer pairs developed from ESTs of C. longa, which also belongs to the Zingiberaceae family.
Retrieved sequences were then uploaded to the TRIMEST program of the EMBOSS suite (genome.csdb.cn/cgibin/emboss/trimest) to remove the poly-A and poly-T stretches of the ESTs corresponding to the poly-A tails of eukaryotic mRNA (Kumpatla and Mukhopadhyay, 2005). The sequences were then assembled using the contig assembly program CAP3 on the Mobyle portal platform (mobyle.pasteur.fr/cgi-bin/portal.py).
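The trimming step can be illustrated with a minimal Python sketch. It is not TRIMEST's actual algorithm (which, among other things, tolerates mismatches inside the tail); it simply strips an exact poly-A run from the 3' end and an exact poly-T run from the 5' end. The function name trim_poly_tails and the min_len threshold are assumptions made for this illustration.

```python
import re


def trim_poly_tails(seq: str, min_len: int = 4) -> str:
    """Strip a 3' poly-A run and a 5' poly-T run of at least min_len bases.

    Simplified illustration only: exact runs, no mismatches allowed.
    """
    seq = seq.upper()
    # Poly-A tail at the 3' end of the EST
    seq = re.sub(r"A{%d,}$" % min_len, "", seq)
    # Poly-T stretch at the 5' end (a poly-A tail read from the opposite strand)
    seq = re.sub(r"^T{%d,}" % min_len, "", seq)
    return seq
```

For example, trim_poly_tails("TTTTTTGCATGCAAAAAAA") returns "GCATGC", while a run shorter than min_len is left in place.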
In total, 5050 potential unique ESTs, comprising 3051 contigs and 1999 singletons, were generated, and 290 SSR regions were identified from this non-redundant dataset using the software WEBSAT (wsmartins.net/websat/). Among the 290 SSRs identified, 84 could not be used to design primers because the flanking sequences were too short. Hence, 206 primer pairs were designed using WEBSAT. The quality of the designed primers was checked using NETPRIMER (premierbiosoft.com/netprimer/index.html). A selected set of 20 primers was synthesized at Integrated DNA Technologies (IDT), New Delhi.
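To make the SSR identification step concrete, the following Python sketch scans a sequence for perfect di- to hexa-nucleotide repeats with a regular expression. It is not WEBSAT's implementation. The minimum repeat counts in MIN_REPEATS are hypothetical, since the search criteria are not stated in the text, and the function name find_ssrs is likewise an assumption.

```python
import re

# Hypothetical minimum repeat counts per motif length (not given in the text).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 3, 6: 3}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-base motif followed by at least (min_rep - 1) further copies
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are copies of a shorter unit (e.g. "AAA", "ATAT")
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)
```

Each hit reports the 0-based start position, so the length of the flanking sequence on either side can be checked before primer design; an SSR too close to either end of the EST would be one of the cases where primers cannot be designed.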

Validation of EST-SSR primers in E. cardamomum
The transferability of 20 EST-SSR primer pairs developed from ESTs of Curcuma was tested in cardamom. All the cardamom accessions, including wild collections (W), landraces (L) and released varieties (RV), were obtained from the cardamom germplasm conservatory of JNTBGRI. They were originally collected from different parts of Kerala and from the germplasm collections available at the Indian Cardamom Research Institute (ICRI) and the Cardamom Research Station (CRS) in Idukki. Total genomic DNA was isolated from young leaves using the DNeasy Plant Mini Kit (Qiagen, USA). DNA concentration and quality were analyzed using a Biophotometer (Eppendorf India Ltd) and agarose gel electrophoresis (3%) at 70 V. The genomic DNA was diluted to a concentration of 50 ng/μl for PCR amplification.
Polymerase chain reaction (PCR) was performed on the selected accessions in a 25 μl reaction mixture containing 50 ng template DNA, 1× PCR buffer, 200 μM of each of the four dNTPs, 15 pmol each of the forward and reverse primers and one unit of Taq DNA polymerase. The following PCR conditions were used: 94°C for 2 min, followed by 35 cycles of 94°C for 30 s, the primer-specific annealing temperature for 1 min and 72°C for 2 min, with a final extension at 72°C for 7 min. Amplification was carried out on an Agilent Technologies thermal cycler (Agilent Technologies, Malaysia). The amplified products were resolved on a 3% agarose gel at 70 V for 3 h and visualised with ethidium bromide in a gel documentation system (UVP, UK).

Analysis of genetic diversity
The bands obtained with each primer were scored using the expected size range of the PCR products as reference. Basic statistics were obtained using PowerMarker Ver 3.25 (Liu and Muse, 2005): the number of observations (No) for a marker locus, calculated from the number of non-missing genotypes observed in a sample; allele frequency (Af); heterozygosity (H), the proportion of heterozygous individuals in the population; gene diversity (D), the probability that two randomly chosen alleles from the population are different; and polymorphism information content (PIC) (Botstein et al., 1980), an estimate of allelic variation that ranges from 0 to 1, where 0 indicates no allelic variation (Hildebrand et al., 1992). Nei's genetic distance (Schneiderbauer et al., 1991) was computed for each pair of accessions. UPGMA dendrograms based on the genetic distance values were constructed using TreeView V1.6.6 (Page, 1996).
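The per-locus statistics defined above can be sketched in Python as follows. This is an illustration of the textbook definitions, not PowerMarker's code: gene diversity is taken as 1 − Σp², without any small-sample correction the software may apply, and PIC follows the Botstein et al. (1980) formula 1 − Σp² − Σ 2p_i²p_j² over allele pairs. The function name locus_stats, the genotype encoding and the toy data are assumptions.

```python
from collections import Counter
from itertools import combinations


def locus_stats(genotypes):
    """Summary statistics for one diploid marker locus.

    genotypes: list of (allele_a, allele_b) tuples; None marks a missing genotype.
    """
    observed = [g for g in genotypes if g is not None]
    n = len(observed)  # number of observations (No)
    if n == 0:
        raise ValueError("no non-missing genotypes at this locus")
    counts = Counter(allele for g in observed for allele in g)
    freqs = [c / (2 * n) for c in counts.values()]  # allele frequencies
    # Proportion of heterozygous individuals
    heterozygosity = sum(a != b for a, b in observed) / n
    # Probability that two randomly chosen alleles differ (uncorrected)
    gene_diversity = 1 - sum(p * p for p in freqs)
    # PIC after Botstein et al. (1980)
    pic = gene_diversity - sum(2 * p * p * q * q
                               for p, q in combinations(freqs, 2))
    return {
        "n_obs": n,
        "major_allele_freq": max(freqs),
        "heterozygosity": heterozygosity,
        "gene_diversity": gene_diversity,
        "pic": pic,
    }
```

For a locus with two equally frequent alleles, this gives a gene diversity of 0.5 and a PIC of 0.375, whereas a monomorphic locus gives 0 for both.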

Type and frequency of Curcuma longa EST-SSRs
A total of 12,678 ESTs were used to evaluate the presence of SSR motifs. To eliminate redundant sequences, the CAP3 contig assembly program was used to obtain consensus sequences from overlapping clusters of ESTs. In total, 5050 potential unique ESTs, comprising 3051 contigs and 1999 singletons, were generated. A total of 290 SSRs were identified from 270 unique ESTs, of which 18 (about 6.67%) contained more than one SSR. The repeat units were tri- (52.1%), di- (44.1%), tetra- (3.1%), penta- (0.3%) and hexa-nucleotide (0.3%), and the repeat motifs TA and CT were the most abundant.

Identification and characteristics of EST-SSRs in E. cardamomum
Among the 290 SSRs identified, 84 could not be used to design primers because the flanking sequences were too short. Therefore, only 206 primer pairs were designed for the SSRs, and 20 were custom synthesized to test cross-amplification in cardamom. The transferability of these 20 EST-SSR primer pairs developed from ESTs of Curcuma longa was tested in Elettaria cardamomum. For each of the 20 EST-SSR primers, the name, the sequences of the forward and reverse primers, the repeat type, the annealing temperature and the expected size of the PCR products are listed in Table 1. Analysis of the nucleotide sequences of the EST-SSRs showed that, in E. cardamomum, the EST-SSRs corresponded to 50% trinucleotide repeats, 37.5% dinucleotide repeats and 12.5% hexanucleotide repeats. Among these primers, 12 EST-SSR primer pairs produced amplicons in the 18 cardamom accessions. The failure of the other eight primer pairs to produce amplicons may be due to primers spanning splice sites, large introns, chimeric primer(s) or poor-quality sequences (Varshney et al., 2005).

Genetic variation
The allele frequency (for the major allele) was 0.82 (range 0.50 to 1.00), the number of alleles per locus was 1.62, gene diversity was 0.23 (range 0 to 0.50), heterozygosity was 0.31 (range 0 to 1.00), PIC was 0.18 and the inbreeding coefficient was -0.34 (range -1.00 to 0.81). Inspection of the marker data showed that four microsatellite loci were monomorphic in all the accessions (with a few missing bands in some accessions). These loci were not included in the preceding variability analysis. At the accession level, allele frequency ranged from 0.44 (ICRI-5) to 0.81 (wild), the number of alleles from 0.88 (ICRI-1) to 1.75 (wild), and PIC from 0.14 (wild) to 0.48 (ICRI-5 and ICRI-7).

Phylogenetic analysis
The pair-wise genetic distance matrix (Schneiderbauer et al., 1991) was computed from the SSR data. The genetic distance values ranged from 0 to 0.45 with a mean of 0.15. Cluster analysis was performed on the SSR data by the UPGMA (unweighted pair-group method using arithmetic averages) method, and the dendrograms, constructed with TREEVIEW (Page, 1996), show the overall genetic relatedness among the individuals (Fig. 1). The accessions clustered into two main groups, one with 13 accessions and the other with the remaining 5. The first group has 2 subgroups. The first subgroup consists of 4 accessions, of which 2 (C24 and C53) are either wild or represent abandoned cultivars, whereas the other 2 (C55 and C63) are landraces currently popular among farmers. The second subgroup is further divided into 2 groups. The first contains 4 accessions: C60 and C56 are landraces, C61 is a released variety, and C70, which was collected from forest land, may be wild or an escape from a plantation. The second consists of 5 accessions, of which 3 are released varieties and the remaining 2 are landraces. Of the 5 accessions in the second main group, two are released varieties and three are wild accessions. The dendrogram revealed a complex distribution pattern, which agrees with a previous report (Ashitha et al., 2013) and indicates the diverse nature of the accessions.
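The UPGMA procedure used for the cluster analysis can be sketched in pure Python. This is a generic illustration of the method, not the software used in the study: at each step the two closest clusters are merged, and the distance from the merged cluster to any other cluster is the size-weighted mean of the two previous distances. The function name upgma and the four-item distance matrix in the usage note are made up for illustration and are not the study's data.

```python
def upgma(names, dist):
    """UPGMA clustering of a symmetric distance matrix.

    Returns (tree, merges): tree is a nested tuple of names and merges
    lists (left_subtree, right_subtree, node_height) in merge order.
    """
    clusters = {i: (name, 1) for i, name in enumerate(names)}  # id -> (tree, size)
    d = {(i, j): dist[i][j] for i in clusters for j in clusters if i < j}
    next_id = len(names)
    merges = []
    while len(clusters) > 1:
        # Closest pair of current clusters (keys are always stored as i < j)
        (a, b), h = min(d.items(), key=lambda kv: kv[1])
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        del d[(a, b)]
        for c in clusters:
            dac = d.pop((min(a, c), max(a, c)))
            dbc = d.pop((min(b, c), max(b, c)))
            # Size-weighted average keeps every original item equally weighted
            d[(c, next_id)] = (size_a * dac + size_b * dbc) / (size_a + size_b)
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        merges.append((tree_a, tree_b, h / 2))  # node height is half the distance
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree, merges
```

With names ["A", "B", "C", "D"] and distances A-B 0.1, C-D 0.2 and 0.4 to 0.5 between the two pairs, the sketch joins A with B, then C with D, and finally the two pairs, which mirrors how the two main groups of a dendrogram arise.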
The study showed that the repeat units were tri- (52.1%), di- (44.1%), tetra- (3.1%), penta- (0.3%) and hexa-nucleotide (0.3%), with TA and CT the most abundant repeat motifs. The SSR analysis using 12 microsatellite loci generated 211 alleles in total, on the basis of which genetic diversity was assessed. Eight of these markers revealed low (0.14) to moderate (0.48) polymorphism information content (PIC) across 18 genotypes of E. cardamomum, while 4 markers were monomorphic.
This study gives an insight into the frequency, type and distribution of Curcuma EST-SSRs and demonstrates the successful development and utility of EST-SSR markers in cardamom. These EST-SSR markers add to the current resource of molecular markers and would be useful for quantitative trait mapping and marker-assisted selection, besides genetic diversity and phylogenetic studies in cardamom.