Searching vs Browsing/Filtering Data at ZFIN

People search the ZFIN database every day, most frequently by gene symbol. From there, the vast majority of searchers go to the gene page for the gene they searched for. From the gene page, all the information in the ZFIN database about that gene is available. This is a very direct and powerful way to get to the gene record you want to see in ZFIN. However, this is also a narrow view of the potentially insightful related data that may surround that gene record. Often, a data browsing/filtering approach can be more powerful for open-ended or complex queries. By making that cloud of related data more visible, browsing and filtering may also reveal relationships in data that may not have been anticipated. ZFIN supports data browsing/filtering using the "single box search" found on the ZFIN home page and at the top right of all other ZFIN web pages (Figure 1).

Figure 1. Access the ZFIN single box search from the home page or the top right of other pages in ZFIN.

Let's look at some examples.

  1. Which genes are most commonly found in publications discussing dlx2a?

    1. From the ZFIN home page, search for "dlx2a" in the single box search.

    2. Select Publication from the list of categories at the left side of the search results page to get the publication search results.

    3. The left side of the Publication search results page has a collection of filters which can be used to narrow the result set, including "Gene", "Mutation/Tg", etc. Each time one of the filter values is clicked, the publication result set is limited to only publications which have that attribute. For example, if the dlx2a gene in the "Gene" filter is clicked, all the publications will then be required to include the dlx2a gene. Today, 384 publications meet that criterion (Figure 2).

    4. Here is where the magic is. Not only do the filters on the left side allow you to reduce the publication result set further based on specific attributes, they also contain information of their own. For example, you can immediately see at the top center of the search results that there are 384 publications that include dlx2a. The filters at the left show that of the 384 dlx2a publications, 107 also include shha, and 96 include foxd3. Thus, the answer to our question "Which genes are most commonly found in publications discussing dlx2a?" is shha, foxd3, egr2b, and sox10.

    5. If you want to look further, click the "Show All" link at the bottom of each filter. There you can look at a larger list of potential filter values and search those values for specific items of interest. Try this on your own: How many, and which, journal publications are about dlx2a and dlx1a? (hint: use the "Publication Type" filter!)



      Figure 2. Single box search for publications that include the dlx2a gene.

  2. Which genes are associated with a retinal phenotype?

    1. Click in the single box search text entry box on the ZFIN home page (or at the top right of any other ZFIN page) and, leaving the box empty, press Enter or click the "Go" button. You now have access to all the data in ZFIN in the single box search interface.

    2. Click the "Gene/Transcript" category in the left column. Now only the genes and transcripts in ZFIN are in the search results, and the "Gene/Transcript" filters are available on the left side of the page. We want to know which genes are associated with a retina phenotype, so we are looking for genes, not transcripts.

    3. In the "Type" filter, select "Gene". This will limit the result set to just records of type "Gene". Click the "Type" filter label to collapse that filter since it is now set up as desired.

    4. In the collection of "Phenotype" filters, locate the "Affected Anatomy" filter. This lists every anatomical structure that has been annotated as affected in a phenotype associated with any gene in the result set. We are looking for genes associated with a retina phenotype.

    5. Click "Show All" under "Affected Anatomy" to access the full list of affected anatomical structures. Filter that list to find "retina".

    6. Select "retina" from the Affected Anatomy filter. Today, that limits the full set of genes to 520 genes which have a retina phenotype (Figure 3).

    7. That list of 520 genes answers our question! These 520 genes are associated with a phenotype affecting the retina.

    8. These can then be further filtered if desired using the filters at the left, or they can then be downloaded by clicking on the "Download" button at the top of the search results.



      Figure 3. Results of filtering to locate genes having a retinal phenotype.

  3. What other phenotypes are associated with genes producing retinal phenotypes?

      1. Starting from the list of 520 genes with a retina phenotype identified in the previous example, what other phenotypes are associated with those genes?

      2. Clicking the "Phenotype Statement" filter in the "Phenotypes" filter collection in the left column will show the most common phenotypes associated with this set of 520 genes having a retina phenotype, and how many genes in the result set are associated with that additional phenotype.

      3. Today the filter shows that of the 520 "retina phenotype genes" 262 are associated with decreased eye size, 106 are associated with decreased head size, and 102 are associated with an edematous pericardium (Figure 4).




        Figure 4. Genes associated with a retinal phenotype are also associated with reduced eye size, decreased head size, and an edematous pericardium as seen in the "Phenotype Statement" filter.

      4. Clicking on "Show All" at the bottom of the "Phenotype Statement" filter provides access to all the phenotypes associated with this set of genes and the ability to search over those to explore the data further.

      5. For example, are any of these 520 genes which have a retina phenotype also associated with the absence of Meckel's cartilage?

        1. Click "Show All" under "Phenotype Statement".

        2. In the "Filter" box of the popup, type "Meckel".

        3. The phenotype statement set is filtered to only those involving Meckel's cartilage.

        4. Today, five genes are associated with absence of Meckel's cartilage in addition to having the retina phenotype we originally filtered for (Figure 5).

        5. Clicking on "Meckel's cartilage absent, abnormal" will filter the gene results down to just those five genes associated with both a retina phenotype and an absence of Meckel's cartilage (aldh1a2, chd7, alx1, med14, and nup107) (Figure 6).

          Figure 5. Filtering the "Phenotype Statement" values for phenotypes involving Meckel's cartilage.



          Figure 6. Five genes are associated with both a retinal phenotype and the absence of Meckel's cartilage.


  4. Does the Zebrafish International Resource Center (ZIRC) have any point mutations available for sale which create a premature stop resulting in phenotypes including decreased size of the eye and inner ear?


    1. Go to the ZFIN home page, click in the text entry area for single box search, and press Enter or click the "Go" button to access all data in the search interface.
    2. We are looking for mutants, so in the list of Categories at the left, select "Mutation/Tg".
    3. The result set now includes all the mutation and Tg records in ZFIN (currently 68,824). Now, filter that long list down to find just those mutants that meet our criteria.
    4. First, the mutant needs to be available from ZIRC, so click to open the "Source" filter, and select "Zebrafish International Resource Center (ZIRC)". Currently that filters the set down to 43,865 mutants/transgenics.
    5. Next, the desired mutant must be a point mutation. In the "Type" filter, click "Point Mutation". That limits the set further to only include point mutations (31,566 currently).
    6. Then, the desired mutant should result in a premature stop. Click to open the "Consequence" filter and click "premature stop". That filters the set further to only include those features which have a premature stop as their consequence (20,181 currently).
    7. Finally, the desired mutant should have decreased eye and inner ear size. Click to open the "Phenotype Statement" filter and click "eye decreased size, abnormal". Today, that limits the resulting list of mutants to 25 candidates.
    8. Click on "inner ear decreased size, abnormal" in the "Phenotype Statement" filter. Today that filters the set of mutants down to just five mutants (jj410, sa964, sa1349, sa1376, and sa3219) (Figure 7).
    9. Clicking "Show All" in the "Affected Genomic Region" filter shows that these mutants are associated with the genes cars, lmx1bb, mdn1, polr1a, and tfip11.
    10. The "Phenotype Statement" filter shows that two of these mutants are associated with an abnormal gut phenotype, and three are associated with an edematous pericardium phenotype.



    Figure 7. Filtering to locate mutants which are available from ZIRC and are point mutations creating a premature stop and resulting in a phenotype including decreased eye and inner ear size.
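
The counting behavior that the examples above rely on (each filter value shows how many records in the current result set carry it) can be sketched in a few lines of Python. This is only an illustration of the general faceted-search idea, not ZFIN's implementation. The records, the `genes` field, and the helper functions `apply_filter` and `facet_counts` are all hypothetical, and the toy publication IDs and gene lists are invented for the example.

```python
from collections import Counter

# Hypothetical toy records for illustration only; not real ZFIN data.
PUBLICATIONS = [
    {"id": "pub1", "genes": {"dlx2a", "shha", "foxd3"}},
    {"id": "pub2", "genes": {"dlx2a", "shha"}},
    {"id": "pub3", "genes": {"dlx2a", "sox10"}},
    {"id": "pub4", "genes": {"shha", "egr2b"}},
]


def apply_filter(records, field, value):
    """Keep only the records whose `field` set contains `value`."""
    return [r for r in records if value in r[field]]


def facet_counts(records, field):
    """Count how many records in the current result set carry each value."""
    counts = Counter()
    for record in records:
        counts.update(record[field])
    return counts


# Step 1: restrict the result set to publications that include dlx2a.
dlx2a_pubs = apply_filter(PUBLICATIONS, "genes", "dlx2a")

# Step 2: the "Gene" filter is just a count over what remains.
gene_facet = facet_counts(dlx2a_pubs, "genes")

# Step 3: drop the gene we filtered on to see what co-occurs with it.
co_mentioned = [(g, n) for g, n in gene_facet.most_common() if g != "dlx2a"]

print(len(dlx2a_pubs), "publications include dlx2a")
print(co_mentioned)
```

In this sketch, the filter counts are recomputed from whatever records remain after each click, which is why the numbers beside each filter value change as the result set narrows.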

    Conclusion:
    The above examples only scratch the surface of the functionality provided in the single box search filtering interface. Filtering data in the single box search limits the result set to only those records that have the attributes you select from the filters. It is also possible to exclude results that have a given filter value by clicking the small minus sign that appears beside each filter value when your mouse passes over it. The small plus sign that appears beside it has the same effect as clicking the term itself.

    This filtering approach shows you where the data are and how many records are available. The only values offered in the filters are values that will produce at least one record in the results, which eliminates searching blindly and getting no results. Each filter value is also marked with the number of search results that will be retained if you select it, providing insight into the composition of your result set in a way that direct searching cannot.

    Single box search currently supports searching and filtering for genes, transcripts, expression, phenotype, human disease, Fish, reporter lines, mutation/Tgs, constructs, morpholinos, TALENs, CRISPRs, antibodies, markers/clones, published figures, anatomy and GO terms, publications, people, labs, and companies. You can also search across all of these categories at once. Come up with your own questions and see if single box search can find the answers! As always, we welcome your feedback on single box searching and filtering, or any other aspect of ZFIN, at zfinadmn [at] zfin.org.
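
For readers who think in code, the include/exclude behavior and the guarantee that every offered filter value matches at least one record can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of how ZFIN is built. The records, the field names `type` and `consequence`, the values in them, and the functions `include`, `exclude`, and `available_values` are all hypothetical.

```python
from collections import Counter

# Hypothetical toy records for illustration only; not real ZFIN data.
MUTANTS = [
    {"id": "m1", "type": "Point Mutation", "consequence": "premature stop"},
    {"id": "m2", "type": "Point Mutation", "consequence": "missense"},
    {"id": "m3", "type": "Insertion", "consequence": "premature stop"},
    {"id": "m4", "type": "Deletion", "consequence": "frameshift"},
]


def include(records, field, value):
    """Like clicking a filter value (or its plus sign): keep matching records."""
    return [r for r in records if r[field] == value]


def exclude(records, field, value):
    """Like clicking the minus sign: drop records that have this value."""
    return [r for r in records if r[field] != value]


def available_values(records, field):
    """Values offered for `field`, with counts, for the current result set.

    The values are derived from the remaining records themselves, so every
    value offered is guaranteed to match at least one record.
    """
    return Counter(r[field] for r in records)


# Include: narrow to point mutations, then see which consequences remain.
point = include(MUTANTS, "type", "Point Mutation")
print(available_values(point, "consequence"))

# Exclude: remove premature stops from the point mutations.
not_stop = exclude(point, "consequence", "premature stop")
print([r["id"] for r in not_stop])
```

Because the offered values are computed from the records that remain, "frameshift" is never offered once the set has been narrowed to point mutations in this toy data, so no click can lead to an empty result.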