Frequently asked questions

General

What is NP Analyst?

NP Analyst is a data integration pipeline for predicting biological activities of natural products directly from complex mixtures.

How do I get started?

You need both bioassay results and mass spectrometry data on your set of samples. Head over to the Quickstart Guide for some recommendations on data types and formats, and instructions on how to set up the analysis.

How much does it cost to use NP Analyst?

Nothing! NP Analyst is open source and free to use. If you would like the source code you can find it at github.com/liningtonlab/npanalyst. The local command line interface version is also available at this site.

Do I need to register to use NP Analyst?

No. The system is free and open. Providing an email is an optional step that allows us to send you a link to your results files.

How it works

How does NP Analyst predict bioactive features?

NP Analyst determines the strength of the fingerprint (Activity Score) and the consistency of that fingerprint (Cluster Score) and then filters the MS data to retain only those MS features that meet minimum cutoffs in both metrics.
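As a rough illustration of that filtering step, here is a minimal Python sketch. It assumes both scores have already been computed for each MS feature. The `FeatureScores` type, the `retain_features` function, the example values, and the choice of an inclusive (`>=`) comparison are all hypothetical and are not taken from the NP Analyst source code.

```python
from typing import NamedTuple


class FeatureScores(NamedTuple):
    """Hypothetical record holding the two scores for one MS feature."""
    feature_id: str
    activity_score: float
    cluster_score: float


def retain_features(features, activity_cutoff, cluster_cutoff):
    """Keep only the features that meet both minimum cutoffs.

    Assumption: a feature sitting exactly on a cutoff is kept.
    """
    return [
        f for f in features
        if f.activity_score >= activity_cutoff
        and f.cluster_score >= cluster_cutoff
    ]


# Made-up scores for three features.
features = [
    FeatureScores("feature_1", 2.4, 0.62),   # passes both cutoffs
    FeatureScores("feature_2", 2.9, -0.15),  # fails the Cluster Score cutoff
    FeatureScores("feature_3", 0.1, 0.80),   # fails the Activity Score cutoff
]

kept = retain_features(features, activity_cutoff=0.4, cluster_cutoff=0.1)
print([f.feature_id for f in kept])
```

A feature has to pass both cutoffs, so a strong but inconsistent fingerprint (`feature_2`) is dropped just like a consistent but weak one (`feature_3`).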

Do I need to run a specific bioassay?

No. NP Analyst works with any biological data. We recommend multiple biological readouts, and data normalization for optimal results. There is lots of information on these topics in the File import section of the documentation.

Can I use MS data from any instrument?

Probably. If you can convert it to mzML or process it in MZmine or GNPS then you can use it in NP Analyst. There are probably some exotic experiments that are not compatible with these platforms, but they should be rare.

What is Activity Score? How should I select my score cutoff?

Activity Score is the sum, over all bioassay columns, of the squared average activity across the samples that contain a given MS feature. In other words, Activity Score is a measure of the strength of the biological phenotype. Only MS features that are active in many columns in the bioassay table and have consistent activity among the samples will have high Activity Scores. If the data are normalized then the maximum value is the same as the number of columns. Typically, 10% of the maximum value is a reasonable starting point for the Activity Score cutoff.
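The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the NP Analyst implementation: the function name `activity_score` and the example data are hypothetical, and the data are assumed to be normalized so that no value exceeds 1 in magnitude (which is what makes the maximum score equal to the number of columns).

```python
from statistics import mean


def activity_score(fingerprints):
    """Sum over bioassay columns of the squared mean activity.

    `fingerprints` holds one row of bioassay values for each sample that
    contains the MS feature; every row has the same number of columns.
    """
    columns = zip(*fingerprints)  # transpose: one tuple per bioassay column
    return sum(mean(column) ** 2 for column in columns)


# Made-up normalized data: 3 samples containing the feature, 4 bioassay columns.
samples_with_feature = [
    [1.0, 0.5, 0.0, 1.0],
    [1.0, 0.5, 0.0, 0.0],
    [1.0, 0.5, 0.0, -1.0],
]

score = activity_score(samples_with_feature)  # 1.0 + 0.25 + 0.0 + 0.0 = 1.25

# Starting-point cutoff suggested above: 10% of the maximum value,
# where the maximum equals the number of bioassay columns.
max_score = len(samples_with_feature[0])
suggested_cutoff = 0.1 * max_score

print(score, suggested_cutoff)
```

Note how the last column contributes nothing: the activities 1.0, 0.0, and -1.0 average to zero, so inconsistent activity across samples is penalized even when individual samples are strongly active.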

What is Cluster Score? How should I select my score cutoff?

Cluster Score is the consistency of the biological fingerprint associated with each MS feature. It is calculated by averaging the Pearson similarity scores between the fingerprints of all pairs of samples that contain a given MS feature. Typically, a low positive value (e.g. 0.1) is a reasonable starting point for the Cluster Score cutoff.
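The pairwise averaging can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the NP Analyst implementation: the function names and example data are hypothetical, Pearson similarity is taken to be the standard Pearson correlation coefficient, and the sketch simply raises an error for a feature found in fewer than two samples because this FAQ does not say how that case is handled.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Standard Pearson correlation between two equal-length fingerprints.

    Undefined (division by zero) if either fingerprint is constant;
    this sketch does not handle that case.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(var_x * var_y)


def cluster_score(fingerprints):
    """Average Pearson correlation over all pairs of sample fingerprints."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        # Assumption: treat a feature seen in fewer than two samples as an error.
        raise ValueError("need fingerprints from at least two samples")
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Made-up fingerprints (3 bioassay columns per sample).
consistent = [[0.9, 0.1, 0.5], [0.8, 0.0, 0.4]]  # same shape of response
mixed = consistent + [[0.1, 0.9, 0.5]]           # third sample disagrees

cutoff = 0.1
print(cluster_score(consistent) >= cutoff)
print(cluster_score(mixed) >= cutoff)
```

Because the score is an average over every pair, a single sample with a conflicting fingerprint lowers the score for the whole feature, which is why the `mixed` example falls below the cutoff while the `consistent` one passes.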

Data visualization

Is all my data in the network view?

No. Samples that do not contain any MS features above the selected Activity and Cluster Score cutoffs are removed.

How do I use the scatter plot visualization?

Hover over data points to get information on each one, or use the filter panes below to select subsets of the data. Optionally, hit ‘Select All’ and display data from all the features in the table view at the bottom of the page. This filtered table can be exported using the ‘Export’ button.

What is displayed in the community view page?

See the Communities section of the documentation. This is the best place to review data on individual communities, and identify MS features for further analysis.

How do I find samples of interest in the results?

Either filter for those samples in the Scatter Plot view, or use the Network view to find the community number for your sample and then review the community-level data on the community page.

Data export and sharing

Can I export my results files?

Yes. Table, graph, and community versions of the full results file are available on the Downloads page.

What is the job number for?

The job number is the unique code that gives you access to your results. You can’t search for it again once the job is running, so make sure you save it somewhere safe or, better still, enter an email so we can send you the link.

Can I share the job number with other collaborators?

Sure. Anyone you send the link to will be able to access your results, making it easy to share results between collaborators.

Data security and availability

Do you keep the data I upload?

No. We delete the original bioassay and MS data once the analysis is complete. However, we keep the results files, which contain some of the data from the original files, available for you to view for 6 months.

Are my results private?

Officially, no. Anyone with the link can access your results files. However, job numbers are complex codes that are hard to guess, and we do not post links to results files on the site, so it is unlikely that outsiders will be able to access your data. If you require a secure version, you can run NP Analyst on your local machine using the command line interface. Head over to our GitHub repository (github.com/liningtonlab/npanalyst) for more information on this.

How long are my results available online?

6 months. After that we delete them to clear space on the server.