This repository has been archived by the owner on Apr 26, 2023. It is now read-only.

'relion - particle extracting' crashes with 1M particles #2058

Open
JuhaHuiskonen opened this issue Oct 22, 2019 · 10 comments
@JuhaHuiskonen
Collaborator

When extracting over 1M particles using 'relion - particle extracting' I get the following error:

03487: Sqlite query: INSERT INTO MDTable_3( "rlnCoordinateX", "rlnCoordinateY", "rlnImageName", "rlnMicrographName", "rlnMagnification", "rlnVoltage", "rlnDefocusU", "rlnDefocusV", "rlnDefocusAngle", "rlnSphericalAberration", "rlnBfactor", "rlnCtfScalefactor", "rlnPhaseShift", "rlnAmplitudeContrast", "rlnOriginX", "rlnOriginY", "rlnDetectorPixelSize") SELECT "rlnCoordinateX", "rlnCoordinateY", "rlnImageName", "rlnMicrographName", "rlnMagnification", "rlnVoltage", "rlnDefocusU", "rlnDefocusV", "rlnDefocusAngle", "rlnSphericalAberration", "rlnBfactor", "rlnCtfScalefactor", "rlnPhaseShift", "rlnAmplitudeContrast", "rlnOriginX", "rlnOriginY", "rlnDetectorPixelSize" FROM MDTable_2

If I make a subset of just 200 coordinates, the protocol finishes fine. Is there a maximum limit of particles Scipion can handle?

@pconesa
Member

pconesa commented Oct 23, 2019

Hi @JuhaHuiskonen. No, 200K (I guess you missed the K) is not the maximum limit. I've seen projects using more than that. It is true that above 500K things get very slow, which can become annoying.

This issue must be something else. Could you please post more log lines?

@JuhaHuiskonen
Collaborator Author

I used just 200 (not 200K) to check that the project itself and the inputs were fine. I can try with more to see where it fails.

Here are more log lines from the failed run with 1M particles:

03433: srun which relion_preprocess_mpi --i micrographs_00001-03460.star --part_star micrographs_00001-03460_particles.star --coord_dir "." --coord_suffix .coords.star --part_dir "." --extract --extract_size 400 --set_angpix 4.240000 --bg_radius 47 --invert_contrast --norm --scale 100 --white_dust 5.000 --black_dust 5.000
03434: === RELION MPI setup ===
03435: + Number of MPI processes = 20
03436: + Master (0) runs on host = r05c20.bullx
03437: + Slave 1 runs on host = r05c20.bullx
03438: + Slave 2 runs on host = r05c20.bullx
03439: + Slave 3 runs on host = r05c20.bullx
03440: + Slave 4 runs on host = r05c20.bullx
03441: + Slave 5 runs on host = r05c20.bullx
03442: + Slave 6 runs on host = r05c20.bullx
03443: + Slave 7 runs on host = r05c20.bullx
03444: + Slave 8 runs on host = r05c20.bullx
03445: + Slave 9 runs on host = r05c20.bullx
03446: + Slave 10 runs on host = r05c20.bullx
03447: + Slave 11 runs on host = r05c20.bullx
03448: + Slave 12 runs on host = r05c20.bullx
03449: + Slave 13 runs on host = r05c20.bullx
03450: + Slave 14 runs on host = r05c20.bullx
03451: + Slave 15 runs on host = r05c20.bullx
03452: + Slave 16 runs on host = r05c20.bullx
03453: + Slave 17 runs on host = r05c20.bullx
03454: + Slave 18 runs on host = r05c20.bullx
03455: + Slave 19 runs on host = r05c20.bullx
03456: =================
03457: + Setting pixel size in output STAR file to 4.24 Angstroms
03458: WARNING: You manually changed the pixel size by the --set_angpix option. You can no longer use Bayesian Polishing on the resulting particles.
03459: Extracting particles from the micrographs ...
03460: 12.68/12.68 min ............................................................~~(,_,">
03461: Joining metadata of all particles from 3351 micrographs in one STAR file...
03462: Written out STAR file with 1057682 particles in micrographs_00001-03460_particles.star
03463: The new pixel size of the extracted particles are 16.96 Angstrom/pixel.
03464: Done preprocessing!
03465: FINISHED: extractMicrographListStep, step 1
03466: 2019-10-22 16:12:28.089902
03467: Traceback (most recent call last):
03468: File "/projappl/project_2001566/apps/scipion/2.0/pyworkflow/protocol/protocol.py", line 186, in run
03469: self._run()
03470: File "/projappl/project_2001566/apps/scipion/2.0/pyworkflow/protocol/protocol.py", line 1289, in _run
03471: self._runSteps(startIndex)
03472: File "/projappl/project_2001566/apps/scipion/2.0/pyworkflow/protocol/protocol.py", line 1161, in _runSteps
03473: self._stepsCheckSecs)
03474: File "/projappl/project_2001566/apps/scipion/2.0/pyworkflow/protocol/executor.py", line 133, in runSteps
03475: stepsCheckCallback()
03476: File "/projappl/project_2001566/apps/scipion/2.0/pyworkflow/em/protocol/protocol_particles.py", line 320, in _stepsCheck
03477: self._checkNewOutput()
03478: File "/projappl/project_2001566/apps/scipion/2.0/pyworkflow/em/protocol/protocol_particles.py", line 527, in _checkNewOutput
03479: self._updateOutputPartSet(newDone, streamMode)
03480: File "/projappl/project_2001566/apps/scipion/2.0/pyworkflow/em/protocol/protocol_particles.py", line 581, in _updateOutputPartSet
03481: self.readPartsFromMics(micList, outputParts)
03482: File "/projappl/project_2001566/apps/scipion/2.0/software/lib/python2.7/site-packages/relion/protocols/protocol_extract_particles.py", line 305, in readPartsFromMics
03483: sortByLabel=md.RLN_MICROGRAPH_NAME):
03484: File "/projappl/project_2001566/apps/scipion/2.0/pyworkflow/em/metadata/utils.py", line 97, in iterRows
03485: md.sort(sortByLabel)
03486: XmippError: Error code: 21 message: no such table: MDTable_2
03487: Sqlite query: INSERT INTO MDTable_3( "rlnCoordinateX", "rlnCoordinateY", "rlnImageName", "rlnMicrographName", "rlnMagnification", "rlnVoltage", "rlnDefocusU", "rlnDefocusV", "rlnDefocusAngle", "rlnSphericalAberration", "rlnBfactor", "rlnCtfScalefactor", "rlnPhaseShift", "rlnAmplitudeContrast", "rlnOriginX", "rlnOriginY", "rlnDetectorPixelSize") SELECT "rlnCoordinateX", "rlnCoordinateY", "rlnImageName", "rlnMicrographName", "rlnMagnification", "rlnVoltage", "rlnDefocusU", "rlnDefocusV", "rlnDefocusAngle", "rlnSphericalAberration", "rlnBfactor", "rlnCtfScalefactor", "rlnPhaseShift", "rlnAmplitudeContrast", "rlnOriginX", "rlnOriginY", "rlnDetectorPixelSize" FROM MDTable_2
03488: ------------------- PROTOCOL FAILED (DONE 1/2)

@pconesa
Member

pconesa commented Oct 23, 2019

Sorry @JuhaHuiskonen, I now realize I misread your message.

I've seen sets of almost 8M elements, but they were clearly impractical. 1M particles should work, but you'll wait a long time for some steps to finish or for sets to be displayed. Our users here (I've just asked) say that it works but takes "TOO LONG". I'd say that 1M, as things stand, pushes Scipion's limits and clearly degrades its usability.

We have planned to invest time in this for the next release (we have always planned to)... but I believe this time it has to happen.

@JuhaHuiskonen
Collaborator Author

@pconesa OK, we will wait for the update and in the meantime split the set into smaller chunks.

@delarosatrevin
Member

From the error log it seems like a bug in the Xmipp metadata class, when trying to execute the line:

md.sort(sortByLabel)  # while iterating through the star file rows

I have created an issue in the scipion-em-relion repo. We might consider replacing the use of Xmipp's metadata (we will do it anyway for handling the new Relion 3.1 star files).

@pconesa I don't know if you want to close this one or keep it as a reminder of this problem.
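
For reference, the logged query has the shape of a table-to-table copy (`INSERT INTO MDTable_3 ... SELECT ... FROM MDTable_2`). The sketch below is not Xmipp's metadata code; it uses Python's standard `sqlite3` module with a shortened, illustrative column list to show that SQLite reports exactly this message when the source table of such a copy does not exist. Why `MDTable_2` is missing in the 1M-particle run is not established in this thread.

```python
import sqlite3

# Shortened column list for illustration; the real query copies 17 columns.
COLUMNS = '"rlnCoordinateX", "rlnMicrographName"'


def copy_rows(conn):
    """Copy all rows from MDTable_2 into MDTable_3.

    Returns None on success, or SQLite's error message on failure.
    """
    query = "INSERT INTO MDTable_3 (%s) SELECT %s FROM MDTable_2" % (COLUMNS, COLUMNS)
    try:
        conn.execute(query)
    except sqlite3.OperationalError as exc:
        return str(exc)
    return None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MDTable_3 (%s)" % COLUMNS)

# The source table does not exist yet, so the copy fails.
print(copy_rows(conn))  # no such table: MDTable_2

# Once the source table exists, the same query succeeds.
conn.execute("CREATE TABLE MDTable_2 (%s)" % COLUMNS)
conn.executemany(
    "INSERT INTO MDTable_2 VALUES (?, ?)",
    [(10.0, "mic_b"), (20.0, "mic_a")],
)
print(copy_rows(conn))  # None
```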

@pconesa
Member

pconesa commented Oct 23, 2019

leave it....I'll address it with the others when improving performance

@pconesa pconesa self-assigned this Oct 23, 2019
@JuhaHuiskonen
Collaborator Author

I was wondering whether there will be a quick fix for md.sort(sortByLabel), or whether we should wait for the Relion 3.1 protocols?

@delarosatrevin
Member

Hi @JuhaHuiskonen,
I don't know when the md.sort issue will be addressed in Xmipp, and I don't have time to look into it myself. In the first week of November I plan to start looking into Relion 3.1 and to use another implementation for handling star files. I could start by implementing the particle extraction protocol, so you can give it a try if you already have Relion 3.1 installed. The good thing about Relion 3.1 in Scipion is that you will not be locked into that version and can easily switch back to 3.0. I'm sorry that you are stuck with this issue right now.

I'm wondering if this issue happened in streaming mode or not. Could you try to re-launch this protocol and try a batchSize=20, for example? In that way, I think the generated star files are parsed in smaller chunks and not the whole set.
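
To illustrate the idea behind a smaller batch size (a plain Python sketch, not Scipion's actual batching code; `iter_batches` and the micrograph names are hypothetical): with the 3351 micrographs reported in the log and a batch size of 20, the work is split into 168 batches, so each parsing step sees at most 20 micrographs' worth of particles rather than the whole set.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Hypothetical micrograph names; the count matches the log above.
micrographs = ["micrograph_%05d" % i for i in range(1, 3352)]

batches = list(iter_batches(micrographs, 20))
print(len(batches))      # 168
print(len(batches[-1]))  # 11
```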

@JuhaHuiskonen
Collaborator Author

This helped us with errors related to large projects and SQL operations:

SQLITE_TMPDIR=/path/to/large/scratch/disk/
export SQLITE_TMPDIR
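
If the launcher is scripted, the same setting can be prepared in Python. This is a minimal sketch (Python 3, so not the Python 2.7 bundled with Scipion 2.0); `sqlite_tmpdir_env` and `min_free_bytes` are hypothetical names. It checks that the scratch directory exists, is writable, and has enough free space, then returns an environment with `SQLITE_TMPDIR` set, which can be passed to a child process through `subprocess.run(..., env=env)`. The variable is set for the child rather than the current process because SQLite may read it only once, at initialization.

```python
import os
import shutil
import tempfile


def sqlite_tmpdir_env(scratch_dir, min_free_bytes=0, base_env=None):
    """Return a copy of the environment with SQLITE_TMPDIR set to scratch_dir.

    Raises ValueError if the directory is missing, not writable, or has less
    free space than min_free_bytes (a threshold chosen by the caller).
    """
    scratch_dir = os.path.abspath(scratch_dir)
    if not os.path.isdir(scratch_dir):
        raise ValueError("not a directory: %s" % scratch_dir)
    if not os.access(scratch_dir, os.W_OK | os.X_OK):
        raise ValueError("not writable: %s" % scratch_dir)
    free = shutil.disk_usage(scratch_dir).free
    if free < min_free_bytes:
        raise ValueError("only %d bytes free in %s" % (free, scratch_dir))
    env = dict(os.environ if base_env is None else base_env)
    env["SQLITE_TMPDIR"] = scratch_dir
    return env


if __name__ == "__main__":
    # A temporary directory stands in for the real scratch disk.
    with tempfile.TemporaryDirectory() as scratch:
        env = sqlite_tmpdir_env(scratch)
        print(env["SQLITE_TMPDIR"])
```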

@delarosatrevin
Member

Thanks Juha! I think we will keep this issue open as a reminder to check for more robust solutions.
