- Type: New Feature
- Status: Closed
- Priority: Critical
- Resolution: Fixed
- Affects Version/s: virogenesis
- Fix Version/s: 1.31
- Labels:
- Story Points: 8
- Epic Link:
- Sprint: DEV-30-4
- Affect Type: Userdefined
Element name and description
- Name of the element: "Improve Classification with WEVOTE"
- Description of the element on the Scene: "Ensemble classification data, produced by other tools."
- Description of the element in the Property Editor:
"WEVOTE (WEighted VOting Taxonomic idEntification) is a metagenome shotgun sequencing DNA reads classifier based on an ensemble of other classification methods (Kraken, CLARK, etc.)."
Input data
There is one input port:
Item | Value |
---|---|
Port name in GUI | Input classification CSV file |
Port description | Input a CSV file in the following format: 1) a sequence name 2) taxID from the first tool 3) taxID from the second tool 4) etc. |
Port ID in UWL | in |
Number of slots | 1 |
Slot #1 name in GUI | Input URL |
Slot #1 ID in UWL | url |
Slot #1 data type | String |
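The input format described in the port description could be read along these lines. This is only a sketch: it assumes a comma-separated file without a header row and with integer taxIDs, none of which is stated in this issue. The function name `read_ensemble_csv` and the sample rows are hypothetical.

```python
import csv
import io


def read_ensemble_csv(stream):
    """Return {sequence_name: [taxID from tool 1, taxID from tool 2, ...]}.

    Assumptions (not confirmed by the issue): comma delimiter, no header
    row, and every taxID is an integer.
    """
    votes = {}
    for row in csv.reader(stream):
        if not row:
            continue  # skip blank lines
        name, *tax_ids = (field.strip() for field in row)
        votes[name] = [int(tax_id) for tax_id in tax_ids]
    return votes


# Invented sample rows: a sequence name followed by one taxID per tool.
sample = io.StringIO("read_1,9606,9606,0\nread_2,562,561,562\n")
print(read_ensemble_csv(sample))
```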
Output data
There is one output port:
Item | Value |
---|---|
Port name in GUI | WEVOTE-classified sequences |
Port description | A map of sequence names with the associated taxonomy IDs. |
Port ID in UWL | out |
Number of slots | 1 |
Slot #1 name in GUI | Taxonomy classification data |
Slot #1 ID in UWL | tax-data |
Slot #1 data type | tax-classification |
Parameters
# | Parameter | Description | Value in GUI | Default value |
---|---|---|---|---|
1 | Penalty | Score penalty for disagreements (-k) | A spin box with integer values. | 2 |
2 | Number of agreed tools | Specify the minimum number of tools that must agree on the WEVOTE decision (-a). | A spin box with 32-bit integer values >= 0. | 0 |
3 | Score threshold | Score threshold (-s) | A spin box with 32-bit integer values >= 0. | 0 |
4 | Number of threads | Use multiple threads (-n). | A spin box with values from 1 to the number of available cores. | Use the value from the Application Settings (the "Optimize for CPU count" option). |
5 | Output file | Specify the output text file name. | A line edit with the browse button. The value is mandatory ("Required"). | Auto (this is equal to "input_file_name_WEVOTE_Details.txt") |
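As an illustration of how the parameters above might map onto the WEVOTE command line, here is a minimal sketch. The `-k`, `-a`, `-s` and `-n` flags come from the table; `-i`, `-d` and `-p` are taken from the sample command in this issue. The helper name `build_wevote_args`, its argument names, and the default of one thread are assumptions made for the example (the element itself takes the thread count from the Application Settings).

```python
def build_wevote_args(input_csv, taxonomy_dir, prefix, penalty=2,
                      min_agreed=0, score_threshold=0, threads=1,
                      executable="./WEVOTE"):
    """Build the argument list for one WEVOTE run (hypothetical helper)."""
    # Mirror the value ranges the spin boxes in the GUI would enforce.
    if min_agreed < 0:
        raise ValueError("Number of agreed tools must be >= 0")
    if score_threshold < 0:
        raise ValueError("Score threshold must be >= 0")
    if threads < 1:
        raise ValueError("Number of threads must be >= 1")
    return [
        executable,
        "-i", input_csv,             # input classification CSV file
        "-d", taxonomy_dir,          # taxonomy data directory
        "-p", prefix,                # output prefix
        "-n", str(threads),          # Number of threads
        "-k", str(penalty),          # Penalty
        "-a", str(min_agreed),       # Number of agreed tools
        "-s", str(score_threshold),  # Score threshold
    ]


print(" ".join(build_wevote_args("HC1_ensemble.csv", "./taxonomy", "HC1", threads=4)))
```

Returning a list rather than a single string keeps the arguments safe to pass to a process launcher without shell quoting.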
Data processing by the element
- The element takes a CSV file as input, like the output file from the "Ensemble Classification Data" workflow element (see UGENE-6035).
- Use the common taxonomy data that goes with the framework.
- Launch the WEVOTE executable with the specified parameters.
- Rename the output file to the specified name. By default, the name is generated from the input file name, for example "HC1_WEVOTE_Details.txt".
- This file should appear on the WD dashboard.
- Parse the last column of the output file and create a new classification data map. Send it to the output port of the element.
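The final parsing step could look roughly like the sketch below. It assumes that the details file is tab-delimited, has no header row, carries the sequence name in the first column, and holds an integer taxID in the last column; only the "last column" part is stated in this issue, the rest is an assumption. The function name `parse_wevote_details` and the sample lines are hypothetical.

```python
import io


def parse_wevote_details(stream, delimiter="\t"):
    """Return {sequence_name: taxID} taken from the last column of each line.

    Assumptions (not confirmed by the issue): tab delimiter, no header,
    sequence name in the first column, integer taxID in the last column.
    """
    classification = {}
    for line in stream:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split(delimiter)
        classification[fields[0]] = int(fields[-1])
    return classification


# Invented sample lines; the middle columns stand in for whatever
# intermediate values the real file contains.
sample = io.StringIO("read_1\t3\t9606\t9606\nread_2\t3\t562\t561\n")
print(parse_wevote_details(sample))
```

The resulting dictionary corresponds to the classification data map that the element sends to its output port.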
Sample data
See, for example, "HC1_ensemble.csv" and "HC1_WEVOTE_Details.txt" files on the file server (in the ".../virogenesis/tools_testing/wevote_without_classifiers" folder). The second file was produced from the first one by running:
./WEVOTE -i HC1_ensemble.csv -d ./taxonomy -p HC1 -n 4
- relates to: UGENE-6035 Add "Ensemble Classification Data" workflow element (Closed)