Class Analyse

Create a folder named analyse in the project root folder that contains the following results:

GLOBAL: global RMSD (all atoms and C-alpha only), percentage of identities, Modeller score, and the number of templates

LOCAL: results of the cross-validation: per-residue C-alpha RMSD for each template, and the mean RMSD

3D structure: final.pdb, the best model of the project, pickled with the mean RMSD stored in its temperature_factor field

Instance Methods
  __init__(self, outFolder, log=None)
  prepareFolders(self)
Create folders needed by this class.
[[str]] parseFile(self, name)
Parse an identity matrix file.
[str] __listDir(self, path)
List all the files and folders in a directory with the exception of ...
dict, dict global_rmsd_aa(self, validation_folder=None)
Global RMSD values.
dict, dict global_rmsd_ca(self, validation_folder=None)
Global RMSD CA values.
{str:float} get_identities(self, nb_templates, validation_folder=None)
Calculate the mean of the percentage of identities for each template with the others.
{str:float} get_score(self, validation_folder=None)
Get the best global Modeller score for each re-modeled template.
  output_values(self, rmsd_aa_wo_if, rmsd_aa_if, rmsd_ca_wo_if, rmsd_ca_if, identities, score, nb_templates, output_file=None)
Write result to file.
dict get_aln_info(self, output_folder=None)
Collect alignment information.
dict get_templates_rmsd(self, templates)
Collect RMSD values between all the templates.
dict templates_profiles(self, templates, aln_dic, template_rmsd_dic)
Collect RMSD profiles of each template with the target and their %ID.
dict output_cross_val(self, aln_dic, templates_profiles, templates, model, output_file=None)
Calculate the mean RMSD of the model to the templates and write the result to a file.
  updatePDBs_charge(self, mean_rmsd_atoms, model)
Pickle final.pdb, the model judged to be the best of the project.
  go(self, output_folder=None, template_folder=None)
Run analysis of models.

Class Variables
  F_RESULT_FOLDER = '/analyse'
  F_TEMPLATE_FOLDER = '/validation'
  F_PDBModels = F_RESULT_FOLDER+ '/PDBModels.list'
  F_MODELS = Modeller.F_RESULT_FOLDER+ Modeller.F_PDBModels
  F_INPUT_ALNS = '/t_coffee/final.pir_aln'
  F_INPUT_RMSD = '/benchmark'
  F_RMSD_AA = F_RESULT_FOLDER+ '/rmsd_aa.out'
  F_RMSD_CA = F_RESULT_FOLDER+ '/rmsd_ca.out'
  F_OUTPUT_VALUES = F_RESULT_FOLDER+ '/global_results.out'
  F_CROSS_VAL = F_RESULT_FOLDER+ '/local_results.out'
  F_FINAL_PDB = F_RESULT_FOLDER+ '/final.pdb'
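
The constants above are path fragments that begin with a slash, and the parameter descriptions on this page give defaults such as outFolder/F_OUTPUT_VALUES. The following is a minimal sketch of that composition, assuming the fragments are appended to the base folder by plain string concatenation. The helper `resolve` and the example project folder are hypothetical and not part of Biskit.

```python
# Path fragments copied from the class variables listed above.
F_RESULT_FOLDER = '/analyse'
F_OUTPUT_VALUES = F_RESULT_FOLDER + '/global_results.out'
F_CROSS_VAL = F_RESULT_FOLDER + '/local_results.out'
F_FINAL_PDB = F_RESULT_FOLDER + '/final.pdb'


def resolve(out_folder, fragment):
    """Append a '/'-prefixed fragment to the base folder (hypothetical helper)."""
    return out_folder.rstrip('/') + fragment


if __name__ == '__main__':
    base = '/data/my_project'  # hypothetical project root
    for fragment in (F_OUTPUT_VALUES, F_CROSS_VAL, F_FINAL_PDB):
        print(resolve(base, fragment))
```

F_MODELS is left out of the sketch because it depends on constants of the Modeller class whose values are not shown on this page.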

Method Details

__init__(self, outFolder, log=None)
(Constructor)

Parameters:
  • outFolder (str) - base folder for output
  • log (LogFile instance or None) - None reports to STDOUT

prepareFolders(self)

Create folders needed by this class.

parseFile(self, name)

Parse an identity matrix file.
Parameters:
  • name (str) - file to parse
Returns: [[str]]
contents of parsed file

__listDir(self, path)

List all the files and folders in a directory with the exception of ...
Parameters:
  • path (str) - dir to list
Returns: [str]
list of files

global_rmsd_aa(self, validation_folder=None)

Global RMSD values.
Parameters:
  • validation_folder (str) - folder with validation data (default: None → outFolder/F_TEMPLATE_FOLDER)
Returns: dict, dict
two dictionaries:
  • rmsd_aa_wo_if: global all-atom RMSD for each template without iterative fitting
  • rmsd_aa_if: global all-atom RMSD for each template with iterative fitting

global_rmsd_ca(self, validation_folder=None)

Global RMSD CA values.
Parameters:
  • validation_folder (str) - folder with validation data (default: None → outFolder/F_TEMPLATE_FOLDER)
Returns: dict, dict
two dictionaries:
  • rmsd_ca_wo_if: global CA rmsd for each template without iterative fitting
  • rmsd_ca_if: global CA rmsd for each template with iterative fitting

get_identities(self, nb_templates, validation_folder=None)

Calculate the mean of the percentage of identities for each template with the others.
Parameters:
  • nb_templates (int) - number of templates used in the cross-validation
  • validation_folder (str) - folder with validation data (default: None → outFolder/F_TEMPLATE_FOLDER)
Returns: {str:float}
dictionary with mean percent identities for each template
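
The averaging described above can be sketched as follows. This is an illustration, not Biskit's implementation: it assumes the pairwise identities are already available as a nested dictionary, which is a hypothetical layout, and it does not use nb_templates. The template codes and percentages are made-up placeholders.

```python
def mean_identities(identity):
    """Mean percent identity of each template with all the other templates.

    identity: {template: {other_template: percent_identity}}
    (assumed layout; the real method reads its data from the validation folder).
    """
    means = {}
    for name, row in identity.items():
        # Skip the self-comparison so it does not inflate the mean.
        others = [value for other, value in row.items() if other != name]
        means[name] = sum(others) / len(others) if others else 0.0
    return means


if __name__ == '__main__':
    # Placeholder values for three hypothetical templates.
    example = {
        '1abc': {'1abc': 100.0, '2xyz': 40.0, '3def': 60.0},
        '2xyz': {'1abc': 40.0, '2xyz': 100.0, '3def': 30.0},
        '3def': {'1abc': 60.0, '2xyz': 30.0, '3def': 100.0},
    }
    print(mean_identities(example))
```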

get_score(self, validation_folder=None)

Get the best global Modeller score for each re-modeled template.
Parameters:
  • validation_folder (str) - folder with validation data (default: None → outFolder/F_TEMPLATE_FOLDER)
Returns: {str:float}
dictionary with modeller score for each template

output_values(self, rmsd_aa_wo_if, rmsd_aa_if, rmsd_ca_wo_if, rmsd_ca_if, identities, score, nb_templates, output_file=None)

Write result to file.
Parameters:
  • rmsd_aa_wo_if ({str:[float,float]}) - RMSD for heavy atoms, normal fit. A dictionary mapping PDB codes to a list containing the RMSD value and the percentage of atoms discarded in the RMSD calculation.
  • rmsd_aa_if ({str:[float,float]}) - RMSD for heavy atoms, iterative fit.
  • rmsd_ca_wo_if ({str:[float,float]}) - RMSD for CA only, normal fit.
  • rmsd_ca_if ({str:[float,float]}) - RMSD for CA only, iterative fit.
  • identities ({str:float}) - mean identity to template, dictionary mapping pdb codes to identity values
  • score (float) - score calculated by Modeller
  • nb_templates (int) - number of templates used for re-modeling
  • output_file (str) - file to write (default: None → outFolder/F_OUTPUT_VALUES)
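
To make the argument shapes above concrete, here is a small sketch that builds one text row per PDB code from the four RMSD dictionaries and the identity dictionary. The column layout and the function name `format_rows` are assumptions for illustration; the actual layout of the file written by output_values is not documented on this page. All numbers are placeholders.

```python
def format_rows(rmsd_aa_wo_if, rmsd_aa_if, rmsd_ca_wo_if, rmsd_ca_if, identities):
    """Return one formatted line per PDB code (hypothetical layout)."""
    rows = []
    for code in sorted(rmsd_aa_wo_if):
        values = []
        for table in (rmsd_aa_wo_if, rmsd_aa_if, rmsd_ca_wo_if, rmsd_ca_if):
            # Each entry is [rmsd, percent of atoms discarded].
            rmsd, discarded = table[code]
            values.append('%6.2f %5.1f' % (rmsd, discarded))
        rows.append('%-6s %s %5.1f' % (code, ' '.join(values), identities[code]))
    return rows


if __name__ == '__main__':
    # Placeholder data for a single hypothetical template.
    rows = format_rows({'1abc': [2.0, 5.0]}, {'1abc': [1.5, 10.0]},
                       {'1abc': [1.8, 4.0]}, {'1abc': [1.2, 8.0]},
                       {'1abc': 45.0})
    for row in rows:
        print(row)
```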

get_aln_info(self, output_folder=None)

Collect alignment information.
Parameters:
  • output_folder (str) - output folder (default: None → outFolder)
Returns: dict
{'name': 'target', 'seq': 'sequence of the target'}

get_templates_rmsd(self, templates)

Collect RMSD values between all the templates.
Parameters:
  • templates ([str]) - name of the different templates
Returns: dict
template_rmsd_dic, containing the per-residue RMSD values of all the templates

templates_profiles(self, templates, aln_dic, template_rmsd_dic)

Collect RMSD profiles of each template with the target and their %ID.
Parameters:
  • templates ([str]) - name of the different templates
  • aln_dic (dict) - alignment information relating the target to its templates
  • template_rmsd_dic (dict) - per-residue RMSD values of all the templates
Returns: dict
template_profiles, containing the RMSD profile of each template with the target and its %ID

output_cross_val(self, aln_dic, templates_profiles, templates, model, output_file=None)

Calculate the mean RMSD of the model to the templates and write the result to a file.
Parameters:
  • aln_dic (dict) - alignment information relating the target to its templates
  • templates_profiles (dict) - the RMSD profile of each template with the target and its %ID
  • templates ([str]) - name of the different templates
  • model (PDBModel) - model
  • output_file (str) - output file (default: None → outFolder/F_CROSS_VAL)
Returns: dict
mean_rmsd, dictionary with the mean rmsd of the model to the templates.
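
One way to picture the mean RMSD computed here is a per-position average over the template profiles. The sketch below assumes an unweighted mean and a profile layout with None at positions where a template has no value; both are assumptions, and the real method may weight or align the profiles differently. The function name `mean_profile` is hypothetical.

```python
def mean_profile(profiles):
    """Unweighted per-position mean over template RMSD profiles.

    profiles: {template: [rmsd or None, ...]}, all lists of equal length
    (assumed layout). Positions without any value yield None.
    """
    if not profiles:
        return []
    length = len(next(iter(profiles.values())))
    means = []
    for i in range(length):
        values = [p[i] for p in profiles.values() if p[i] is not None]
        means.append(sum(values) / len(values) if values else None)
    return means


if __name__ == '__main__':
    # Placeholder profiles for two hypothetical templates.
    print(mean_profile({'1abc': [1.0, None, 3.0], '2xyz': [3.0, None, None]}))
```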

updatePDBs_charge(self, mean_rmsd_atoms, model)

Pickle final.pdb, the model judged to be the best of the project. The mean RMSD to the templates is written to the temperature_factor column.
Parameters:
  • mean_rmsd_atoms ([int]) - mean rmsd for each atom of the target's model
  • model (PDBModel) - target's model with the highest modeller score
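
Because mean_rmsd_atoms holds one value per atom, a per-residue mean has to be repeated for every atom of its residue before it can fill the temperature_factor column. The following sketch shows that expansion under the assumption that the number of atoms per residue is known; the helper `expand_to_atoms` is hypothetical, and how Biskit performs this step is not documented on this page.

```python
def expand_to_atoms(residue_values, atoms_per_residue):
    """Repeat each per-residue value once for every atom of that residue."""
    if len(residue_values) != len(atoms_per_residue):
        raise ValueError('need exactly one atom count per residue value')
    atom_values = []
    for value, n_atoms in zip(residue_values, atoms_per_residue):
        atom_values.extend([value] * n_atoms)
    return atom_values


if __name__ == '__main__':
    # Two hypothetical residues with 2 and 3 atoms.
    print(expand_to_atoms([1.5, 0.5], [2, 3]))
```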

go(self, output_folder=None, template_folder=None)

Run analysis of models.
Parameters:
  • output_folder (str) - folder for result files (default: None → outFolder/F_RESULT_FOLDER)
  • template_folder (str) - folder with template structures (default: None → outFolder/VS.F_RESULT_FOLDER)
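
The global part of the analysis can be pictured as a chain of the methods documented above. The sketch below uses only the signatures listed on this page; whether go() calls them in this order is an assumption. `run_global_analysis` and `StubAnalyser` are hypothetical names, and the stub returns placeholder values so that the example runs without Biskit installed.

```python
def run_global_analysis(analyser, nb_templates):
    """Chain the documented global-analysis methods on an Analyse-like object."""
    # Every call relies on the default validation_folder=None.
    rmsd_aa_wo_if, rmsd_aa_if = analyser.global_rmsd_aa()
    rmsd_ca_wo_if, rmsd_ca_if = analyser.global_rmsd_ca()
    identities = analyser.get_identities(nb_templates)
    score = analyser.get_score()
    analyser.output_values(rmsd_aa_wo_if, rmsd_aa_if,
                           rmsd_ca_wo_if, rmsd_ca_if,
                           identities, score, nb_templates)
    return identities, score


class StubAnalyser:
    """Hypothetical stand-in for Analyse that returns placeholder data."""

    def __init__(self):
        self.written = None

    def global_rmsd_aa(self, validation_folder=None):
        return {'1abc': [2.0, 5.0]}, {'1abc': [1.5, 10.0]}

    def global_rmsd_ca(self, validation_folder=None):
        return {'1abc': [1.8, 4.0]}, {'1abc': [1.2, 8.0]}

    def get_identities(self, nb_templates, validation_folder=None):
        return {'1abc': 45.0}

    def get_score(self, validation_folder=None):
        return {'1abc': -1000.0}

    def output_values(self, *args, **kwargs):
        # Record the arguments instead of writing a file.
        self.written = args


if __name__ == '__main__':
    stub = StubAnalyser()
    print(run_global_analysis(stub, nb_templates=3))
```

Passing the analyser in as an argument keeps the sketch independent of how an Analyse instance is constructed.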
