Scenario:
- Make sure:
  - taxonomy data are available
  - "viral_database" or "bacterial_viral_database" is available
  - RefSeq data are NOT available
- Open a valid workflow with CLARK-classify.
- Run the workflow with the default parameters.
Expected state: the workflow ran successfully.
- Change some CLARK algorithmic parameters, e.g. "Minimum k-mer frequency", "Gap", etc.
- Run the workflow again.
Current state: CLARK attempts to build a database with the new parameters. As RefSeq data are not available, the resulting database does not contain any meaningful data. The log contains many errors like:
Failed to open ../../refseq/viral/AC_000001.1.fna
Failed to open ../../refseq/viral/AC_000002.1.fna
Failed to open ../../refseq/viral/AC_000003.1.fna
...
However, these errors are not reflected anywhere on the WD dashboard.
Expected state: the workflow should be considered invalid in this case. Workflow validation should show an error:
Reference database for these CLARK settings is not available. RefSeq data are required to build it.
- Put the RefSeq data in the appropriate location.
- Run the workflow again.
Expected state: the new database is built. There are no errors in the log.
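
The validation rule expected in this scenario could be sketched roughly as follows. This is only an illustration in Python, not UGENE code: the function names, the parameter keys, the default values in DEFAULT_PARAMS, and the check for ".fna" files are all assumptions made for the example. It also assumes that a database rebuild is needed only when some parameter differs from its default.

```python
from pathlib import Path
from typing import Optional

# Placeholder defaults for illustration only; the real defaults of the
# CLARK element may differ.
DEFAULT_PARAMS = {"min_kmer_frequency": 0, "gap": 4}

ERROR_MESSAGE = (
    "Reference database for these CLARK settings is not available. "
    "RefSeq data are required to build it."
)


def has_refseq_data(refseq_dir: Path) -> bool:
    """Return True if the directory holds at least one .fna file (assumed layout)."""
    if not refseq_dir.is_dir():
        return False
    return next(refseq_dir.rglob("*.fna"), None) is not None


def validate_clark_settings(params: dict, refseq_dir: Path) -> Optional[str]:
    """Return an error message if the workflow should be invalid, else None."""
    # Assumption: non-default parameters force CLARK to build a new database.
    needs_rebuild = any(
        params.get(name, default) != default
        for name, default in DEFAULT_PARAMS.items()
    )
    if needs_rebuild and not has_refseq_data(refseq_dir):
        return ERROR_MESSAGE
    return None
```

With default parameters the check passes even without RefSeq data, which matches the first expected state above. With changed parameters it returns the error until RefSeq files appear in the directory.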
- relates to: UGENE-6191 General scenario of CLARK validation, when reference data are not available (Closed)