Files received from a Sanger sequencing facility may have different file names but the same sequence names.
The current UGENE version provides no way to distinguish the reads in this case.
Add a new parameter to the "Map Sanger Reads to Reference" dialog and to the "Map to Reference" workflow element. The parameter is named "Read name in result alignment" and is set with a combo box next to it, with the values "Sequence name from file" (the default) and "File name". The description of the parameter in the Property Editor should be the following:
Reads in the result alignment can be named either by the names of the sequences in the input files or by the input file names. For example, if the sequences have the same name, set this value to "File name" to be able to distinguish the reads in the result alignment.
In the dialog, add the parameter below the "Mapping min identity" parameter in the "Settings" group.
In the Workflow Designer, put it under the "Minimum read identity" parameter. Note that additional modification of the "Sequence Quality Trimmer" element may be required (e.g. addition of slots).
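The naming rule behind the "Read name in result alignment" parameter can be sketched as follows. This is only an illustration in Python, not UGENE code: the names `ReadNameSource` and `read_name` are hypothetical, the example file paths are made up, and dropping the file extension in "File name" mode is an assumption that this issue does not specify.

```python
from enum import Enum
from pathlib import PurePath


class ReadNameSource(Enum):
    """Hypothetical enum mirroring the two combo box values."""
    SEQUENCE_NAME = "Sequence name from file"  # the default value
    FILE_NAME = "File name"


def read_name(file_path, sequence_name, source=ReadNameSource.SEQUENCE_NAME):
    """Return the name a read would get in the result alignment."""
    if source is ReadNameSource.FILE_NAME:
        # Assumption: the extension is dropped; the issue does not say so.
        return PurePath(file_path).stem
    return sequence_name


# Two made-up input files that contain sequences with the same name.
reads = [("run1/A01_fwd.ab1", "sample"), ("run1/A02_rev.ab1", "sample")]

by_sequence = [read_name(path, name) for path, name in reads]
by_file = [read_name(path, name, ReadNameSource.FILE_NAME) for path, name in reads]
```

With the default value both reads end up with the same name, while "File name" mode yields a distinct name per input file.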
Also, in the Workflow Designer modify the wizard of the Sanger sample as follows:
- Rename the wizard dialog from "Trim and Align Sanger Reads" to "Map Sanger Reads to Reference".
- Rename the "Trimming and Filtering" page to "Mapping Settings".
- Rename "Quality threshold" to "Trimming quality threshold" (both in the wizard and in the Property Editor).
- Remove parameter "Min length" from the wizard.
- Rename "Min identity" to "Mapping min similarity" (in the Property Editor) and add it to the wizard.
- Add the new "Read name in result alignment" parameter to the same group in the wizard.
Please also rename the "Mapping min identity" parameter in the "Map Sanger Reads to Reference" dialog to "Mapping min similarity" as part of this issue.
- relates to
UGENE-5863 Wrong identity filtering in Sanger
UGENE-6000 "Read name in result alignment" for Sanger de novo assembly