
AMR phenotyping benchmarking tutorial


Our AMR phenotyping benchmarking workflow is a collection of benchmarking datasets with predefined partitioning; adapted AMR phenotyping methods, each assigned a nested cross-validation evaluation approach or a derivative thereof, depending on the software's intrinsic design; a set of metrics; and procedures for generating summary tables and visualizations.

In this tutorial, we provide a fundamental step-by-step guide for using our workflow to evaluate a model that is not currently part of our benchmark or that may be released in the future by a third party. Two types of machine-learning (ML)-based AMR phenotyping methods are provided as examples.

  • The first is a classic ML model that 1) can use the scikit-learn module for training, 2) builds its feature matrix without phenotype information, and 3) encompasses several classifiers. Among the methods benchmarked in our study, Seq2Geno2Pheno falls into this category. Such methods can be evaluated via nested cross-validation, in which hyperparameter and classifier selection are performed at each iteration.
  • The second is an ML model that 1) possesses an inherent hyperparameter optimization process and 2) encompasses several classifiers, which 3) are not supported by the scikit-learn module. Among the methods benchmarked in our study, Kover falls into this category. Such methods can be evaluated via an iterative evaluation approach, accompanied by cross-validation to select the optimal classifier.

Table of Contents

1. Setup

  • Download our AMR benchmarking workflow
git clone https://github.com/hzi-bifo/AMR_benchmarking.git

1.1 Configuration file

  • Set the configuration file Config.yaml in the root folder. You only need to set the following items (a minimal example follows the table); please keep the default values for the rest.
| option | action | values ([default]) |
|---|---|---|
| dataset_location | Where the PATRIC dataset will be downloaded (~246 GB). | /vol/projects/BIFO/patric_genome |
| output_path | Where the Results folder, holding each software's direct results and further visualizations, will be generated. | ./ |
| log_path | Where the log folder for intermediate files will be generated (~10 TB; files related to completed benchmarking species are cleaned regularly). | ./ |
| n_jobs | Number of CPU cores (>1) to use. | 10 |
| amr_env_name | Name of the conda environment for general use. | amr_env |
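For orientation, a minimal Config.yaml sketch using the defaults above might look as follows; adjust the paths to your system, and keep the defaults of any options not shown here (the file shipped with the repository contains further options).

dataset_location: /vol/projects/BIFO/patric_genome
output_path: ./
log_path: ./
n_jobs: 10
amr_env_name: amr_env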

1.2 Dependencies

  • Linux OS and conda (Miniconda2 4.8.4 was used in our study)

1.3 Installation

  • Load the options set in Config.yaml in the Linux shell
function parse_yaml {
   local prefix=$2
   local s='[[:space:]]*' w='[a-zA-Z0-9_]*' fs=$(echo @|tr @ '\034')
   sed -ne "s|^\($s\):|\1|" \
        -e "s|^\($s\)\($w\)$s:$s[\"']\(.*\)[\"']$s\$|\1$fs\2$fs\3|p" \
        -e "s|^\($s\)\($w\)$s:$s\(.*\)$s\$|\1$fs\2$fs\3|p"  $1 |
   awk -F$fs '{
      indent = length($1)/2;
      vname[indent] = $2;
      for (i in vname) {if (i > indent) {delete vname[i]}}
      if (length($3) > 0) {
         vn=""; for (i=0; i<indent; i++) {vn=(vn)(vname[i])("_")}
         printf("%s%s%s=\"%s\"\n", "'$prefix'",vn, $2, $3);
      }
   }'
}
eval $(parse_yaml Config.yaml)

export PATH="$PWD"/src:$PATH
export PYTHONPATH=$PWD
SCRIPT=$(realpath "$0")
SCRIPTPATH=$(dirname "$SCRIPT")
  • Create a conda environment and install packages with conda in the Linux shell
conda env create -n ${amr_env_name}  -f ./install/amr_env.yml python=3.7 
conda activate ${amr_env_name}

1.4 Download genome sequences

  • Download genome sequences via the Linux shell
bash ./scripts/data_preprocess/retrieve_PATRIC_data.sh ${dataset_location}

2. The first example: nested cross-validation of ML-based methods

  • Create a folder under the AMR_software directory, preferably named after the software, e.g. Pseudo in this tutorial
  • Copy the Python files benchmarking.py and evaluate_ncv.py from the Pseudo directory to your newly created folder

2.1 Feature building

  • Integrate the feature-building procedure into benchmarking.py via the prepare_feature function
def prepare_feature(self):
    df_species,antibiotics=extract_infor(self.level,self.f_all,self.s,self.temp_path,self.software_name)
    for species, antibiotics in zip(df_species, antibiotics):
        antibiotics, _, _ =  load_data.extract_info(species, False, self.level)
        for anti in antibiotics:
            name,_,_ = name_utility.GETname_model(self.software_name,self.level, species, anti,'',self.temp_path)
            phenotype_list = pd.read_csv(name, index_col=0, dtype={'genome_id': object}, sep="\t")
            phenotype_list['path']=str(self.path_sequence) +'/'+ phenotype_list['genome_id'].astype(str)+'.fna'

            ################################################################
            '''
            phenotype_list is a pandas DataFrame with each row representing a genome, e.g.
                genome_id	resistant_phenotype	path 
            0	470.2428	1                   <path_to_sequence>/470.2428.fna
            1	480119.5	0                   <path_to_sequence>/480119.5.fna
            2	470.2433	1                   <path_to_sequence>/470.2433.fna
            3	470.2159	1                   <path_to_sequence>/470.2159.fna
            4	470.2166	1                   <path_to_sequence>/470.2166.fna
            ...
            ...
            
            ![Please add your code here to build features for your model from the above information; a hedged sketch is given after this code block]
            '''
            ## save the feature matrix to a folder under temp_path
            data_feature.to_csv('<path_to_feature>', sep="\t") 
            ################################################################
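A hedged illustration of what could fill the placeholder above: the sketch below builds a simple k-mer count matrix from the .fna files listed in phenotype_list. The k-mer representation, the value k=8, the Biopython dependency, and the helper name build_kmer_features are our own assumptions for illustration; your model will likely require its own feature representation.

## A minimal sketch (not part of the workflow): k-mer count features built from
## the genome paths listed in phenotype_list. Assumes Biopython is installed; k=8 is arbitrary.
from collections import Counter

import pandas as pd
from Bio import SeqIO

def build_kmer_features(phenotype_list, k=8):
    kmer_counts = {}
    for _, row in phenotype_list.iterrows():
        counts = Counter()
        for record in SeqIO.parse(row['path'], 'fasta'):
            seq = str(record.seq).upper()
            counts.update(seq[i:i + k] for i in range(len(seq) - k + 1))
        kmer_counts[row['genome_id']] = counts
    ## rows: genomes (indexed by genome_id); columns: observed k-mers; absent k-mers become 0
    return pd.DataFrame.from_dict(kmer_counts, orient='index').fillna(0)

## e.g. data_feature = build_kmer_features(phenotype_list)
##      data_feature.to_csv('<path_to_feature>', sep="\t")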

2.2 Nested cross-validation

  • Integrate the model into benchmarking.py via the nested_cv function
    • Load the feature matrix
    • Specify the classifiers
    • Specify the model's classifiers' hyperparameter selection range
   def nested_cv(self): ### nested CV
       df_species,antibiotics=extract_infor(self.level,self.f_all,self.s,self.temp_path,self.software_name)
       for species, antibiotics in zip(df_species, antibiotics):

           ## antibiotics is the python list of benchmarked antibiotics for that species
           ## ID is the python list of PATRIC ID, e.g. [1352.10013,1352.10014, ..., 1354.10,1366.10]
           ## Y is the python list of phenotype for each sample in the same order as ID list
           antibiotics, ID, Y = load_data.extract_info(species, False, self.level)
           i_anti = 0
           for anti in antibiotics: ## evaluate each species-antibiotic dataset sequentially
               id_all = ID[i_anti]
               y_all = Y[i_anti]
               i_anti+=1

               '''                
               ! [Please specify the model's classifiers and the feature matrix location here.]
               For example:
               CLASSIFIERS=['svm','lr', 'rf','lsvm']
               data_feature=pd.read_csv('<path_to_feature>', index_col=0,sep="\t")                                
               '''

               X_all = pd.DataFrame(index=id_all) ## initialize the sample-by-feature matrix, indexed by genome ID
               X_all = pd.concat([X_all, data_feature.reindex(X_all.index)], axis=1) ## load the feature matrix
               id_all = np.array(id_all)
               y_all = np.array(y_all)

               ## load in folds for nested CV
               p_names = name_utility.GETname_meta(species,anti,self.level)
               folds_txt=name_utility.GETname_folds(species,anti,self.level,self.f_kma,self.f_phylotree)
               folders_sample = json.load(open(folds_txt, "rb"))
               folders_index=name2index.Get_index(folders_sample,p_names) # CV folds
               for chosen_cl in CLASSIFIERS: # evaluate each classifier sequentially
                   _, _,save_name_score=name_utility.GETname_model(self.software_name, self.level,species, anti,chosen_cl,self.temp_path)
                   file_utility.make_dir(os.path.dirname(save_name_score))


                   ### metrics of 10 folds
                   mcc_test = []
                   f1_test = [] ## F1-macro
                   ## score_report could be used to extract metrics like precision-positive,
                   ## recall-positive, F1-positive, precision-negative, recall-negative, F1-negative, and accuracy
                   score_report_test = []
                   aucs_test = []

                   ### other outputs from nested CV
                   estimator_test=[] ##scikit-learn
                   hyperparameters_test = [] ## hyperparameters selected from inner loop CV for training in each of the 10 outer loop iteration
                   score_InnerLoop = []  ## the best metric scores of inner loops for each of the 10 outer loop iteration
                   index_InnerLoop=[] ## the order of the best hyperparameters in grid search
                   cv_results_InnerLoop=[] ## cv_results_ attributes of scikit-learn  sklearn.model_selection.GridSearchCV

                   ### testing results
                   sampleNames_test=[] ## sample PATRIC ID for a species-antibiotic combination/dataset
                   predictY_test=[] ## predicted AMR phenotype for each sample, ordered the same as sampleNames_test
                   true_Y=[] ## ground truth AMR phenotype for each sample, ordered the same as sampleNames_test



                   for out_cv in range(self.cv): ## outer loop of nested CV
                       print(species,anti,'. Starting outer: ', str(out_cv), '; chosen_cl: ', chosen_cl)

                       test_samples_index = folders_index[out_cv] ## a list of index of the test fold
                       id_test = id_all[test_samples_index]  ## sample name list of the test fold
                       y_test = y_all[test_samples_index] ## ground truth phenotype
                       train_val_index =folders_index[:out_cv] +folders_index[out_cv + 1:] ## 9 folds involved in inner loop for CV
                       id_val_train = id_all[list(itertools.chain.from_iterable(train_val_index))]  ## sample name list of the 9 folds
                       y_val_train = y_all[list(itertools.chain.from_iterable(train_val_index))] ## phenotype of the 9 folds
                       X_val_train=X_all.loc[id_val_train,:] ## feature matrix of samples from the 9 folds
                       X_test=X_all.loc[id_test,:] ##  feature matrix of samples of the test fold

                       '''
                       ! [Please specify the model's classifiers' hyperparameter selection range here. ] 
                       For example:
                       cl = RandomForestClassifier(random_state=1)
                        hyper_space = [
                             {
                               "cl__n_estimators": [100, 200, 500, 1000],
                               "cl__criterion": ["entropy", "gini"],
                               "cl__max_features": ["auto"],
                               "cl__min_samples_split": [2,5,10],
                               "cl__min_samples_leaf": [1],
                               "cl__class_weight": ["balanced", None]
                             }
                           ]                         
                       '''
                       ###############################################################################################
                       ## typical procedures for nested CV inner loop hyperparameter selection.
                        ## note: this part can be varied based on the specific technique, e.g. for Kover this can be replaced with bound selection,
                       ## for neural networks model, this should be replaced with a hyperparameter optimization procedure accompanied by early stopping
                       ## (see function training in AMR_software/AytanAktug/nn/hyperpara.py)
                       ###############################################################################################
                        ### Grid search for hyperparameter selection; create_generator turns the predefined folds into (train, test) index pairs for the cv argument (a sketch is given after this code block)
                       pipe = Pipeline(steps=[('cl', cl)])
                       search = GridSearchCV(estimator=pipe, param_grid=hyper_space, n_jobs=self.n_jobs,
                                                 scoring='f1_macro',
                                                 cv=create_generator(train_val_index), refit=True)


                       search.fit(X_all, y_all)
                       hyperparameters_test_sub=search.best_estimator_
                       scores_best=search.best_score_
                       index_best=search.best_index_
                       cv_results=search.cv_results_
                       current_pipe=hyperparameters_test_sub

                       # retrain on train and validation data using the optimal hyperparameters
                       current_pipe.fit(X_val_train, y_val_train)
                       y_test_pred = current_pipe.predict(X_test)
                       ###############################################################################################


                       # calculate metric scores
                       f1 = f1_score(y_test, y_test_pred, average='macro')
                       report = classification_report(y_test, y_test_pred, labels=[0, 1], output_dict=True)
                       mcc = matthews_corrcoef(y_test, y_test_pred)
                       fpr, tpr, _ = roc_curve(y_test, y_test_pred, pos_label=1)
                       roc_auc = auc(fpr, tpr)
                       f1_test.append(f1)
                       score_report_test.append(report)
                       aucs_test.append(roc_auc)
                       mcc_test.append(mcc)
                       hyperparameters_test.append(hyperparameters_test_sub)
                       score_InnerLoop.append(scores_best)
                       index_InnerLoop.append(index_best)
                       cv_results_InnerLoop.append(cv_results)
                       predictY_test.append( y_test_pred.tolist())
                       true_Y.append(y_test.tolist())
                       sampleNames_test.append(folders_sample[out_cv])
                       estimator_test.append(current_pipe)

                   ### Save metric scores and other model evaluation information
                   score ={'f1_test':f1_test,'score_report_test':score_report_test,'aucs_test':aucs_test,'mcc_test':mcc_test,
                        'predictY_test':predictY_test,'ture_Y':true_Y,'samples':sampleNames_test}
                   score2= {'hyperparameters_test':hyperparameters_test,'estimator_test':estimator_test,
                            'score_InnerLoop':score_InnerLoop,'index_InnerLoop':index_InnerLoop,'cv_results_InnerLoop':cv_results_InnerLoop}
                   with open(save_name_score + '_KMA_' + str(self.f_kma) + '_Tree_' + str(self.f_phylotree) + '.json',
                             'w') as f:  # overwrite mode
                       json.dump(score, f)
                   with open(save_name_score + '_kma_' + str(self.f_kma) + '_tree_' + str(self.f_phylotree) + '_model.pickle',
                             'wb') as f:  # overwrite mode
                       pickle.dump(score2, f)
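The nested_cv code above passes cv=create_generator(train_val_index) to GridSearchCV. A minimal sketch of such a helper is given below, assuming each fold is a list of positional indices into X_all/y_all; the workflow's actual implementation may differ.

import numpy as np

def create_generator(nested_folds):
    ## yield (train_indices, test_indices) pairs in the format scikit-learn's
    ## GridSearchCV accepts via its `cv` argument
    for i in range(len(nested_folds)):
        test_index = np.array(nested_folds[i])
        train_index = np.concatenate(
            [np.array(fold) for j, fold in enumerate(nested_folds) if j != i])
        yield train_index, test_index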

2.3 Run the workflow

  • Usage
usage: evaluate_ncv.py [-h] -software SOFTWARE_NAME -sequence PATH_SEQUENCE [-temp TEMP_PATH] [-l LEVEL] [-f_all] [-f_phylotree] [-f_kma] [-s SPECIES [SPECIES ...]] [-f_ml] [-cv CV_NUMBER]
                   [-n_jobs N_JOBS]

options:
  -h, --help            show this help message and exit
  -software SOFTWARE_NAME, --software_name SOFTWARE_NAME
                        software_name
  -sequence PATH_SEQUENCE, --path_sequence PATH_SEQUENCE
                        Path of the directory with PATRIC sequences
  -temp TEMP_PATH, --temp_path TEMP_PATH
                        Directory to store temporary files, like features.
  -l LEVEL, --level LEVEL
                        Quality control: strict or loose
  -f_all, --f_all       Benchmarking on all the possible species.
  -f_phylotree, --f_phylotree
                        phylogeny-aware evaluation
  -f_kma, --f_kma       homology-aware evaluation
  -s SPECIES [SPECIES ...], --species SPECIES [SPECIES ...]
                        species to run: e.g.'Pseudomonas aeruginosa' 'Klebsiella pneumoniae' 'Escherichia coli' 'Staphylococcus aureus' 'Mycobacterium tuberculosis' 'Salmonella enterica'
                        'Streptococcus pneumoniae' 'Neisseria gonorrhoeae'
  -f_ml, --f_ml         Perform nested CV on ML models
  -cv CV_NUMBER, --cv_number CV_NUMBER
                        CV splits number. Default=10
  -n_jobs N_JOBS, --n_jobs N_JOBS
                        Number of jobs to run in parallel.


  • Example: phylogeny-based evaluation of software named "Pseudo" on E. coli in the Linux shell
python AMR_software/Pseudo/evaluate_ncv.py -software 'Pseudo' -s 'Escherichia coli' -sequence '/vol/projects/patric_genome' -temp ${log_path} -f_phylotree -cv 10 -n_jobs 10

2.4 Generate summary table

  • Usage
usage: result_analysis.py [-h] -software SOFTWARENAME [-cl_list CL_LIST [CL_LIST ...]] [-temp TEMP_PATH] [-out OUTPUT_PATH] [-l LEVEL] [-cv CV_NUMBER] [-f_all] [-s SPECIES [SPECIES ...]] [-f_phylotree] [-f_kma]

options:
  -h, --help            show this help message and exit
  -software SOFTWARENAME, --softwareName SOFTWARENAME
                        Software name.
  -cl_list CL_LIST [CL_LIST ...], --cl_list CL_LIST [CL_LIST ...]
                        classifiers.
  -temp TEMP_PATH, --temp_path TEMP_PATH
                        Directory to store temporary files.
  -out OUTPUT_PATH, --output_path OUTPUT_PATH
                        Directory to store CV scores.
  -l LEVEL, --level LEVEL
                        Quality control: strict or loose
  -cv CV_NUMBER, --cv_number CV_NUMBER
                        CV splits number. Default=10
  -f_all, --f_all       all the possible species
  -s SPECIES [SPECIES ...], --species SPECIES [SPECIES ...]
                        species to run: e.g.'Pseudomonas aeruginosa' 'Klebsiella pneumoniae' 'Escherichia coli' 'Staphylococcus aureus' 'Mycobacterium tuberculosis' 'Salmonella enterica'
                        'Streptococcus pneumoniae' 'Neisseria gonorrhoeae'
  -f_phylotree, --f_phylotree
                        phylogeny-aware evaluation
  -f_kma, --f_kma        homology-aware evaluation
  • Example
    • The software is equipped with a support vector machine and a logistic regression.
    • Generate a summary table of software named "Pseudo" on E. coli, under phylogeny-based evaluation.
    • Output file Escherichia_coli_kma_False_tree_True_SummaryBenchmarking.txt can be found at <output_path>/Results/software/Pseudo/Escherichia_coli in the form shown below.
python ./src/analysis_utility/result_analysis.py -software 'Pseudo' -f_phylotree  -cl_list 'svm' 'lr'  -cv ${cv_number} -temp ${log_path} -o ${output_path} -s 'Escherichia coli' 
antibiotic f1_macro f1_positive f1_negative accuracy precision_macro recall_macro precision_negative recall_negative precision_positive recall_positive clinical_f1_negative clinical_precision_negative clinical_recall_negative
amoxicillin 0.86±0.03 0.89±0.05 0.83±0.05 0.87±0.03 0.86±0.04 0.87±0.04 0.80±0.09 0.88±0.10 0.93±0.05 0.85±0.08 0.843507214 0.798319328 0.894117647
amoxicillin/clavulanic acid 0.69±0.07 0.59±0.14 0.79±0.08 0.74±0.07 0.69±0.07 0.69±0.07 0.81±0.07 0.77±0.11 0.57±0.13 0.62±0.17 0.798613518 0.818763326 0.779431664
ampicillin 0.85±0.08 0.91±0.04 0.78±0.15 0.89±0.05 0.88±0.06 0.84±0.09 0.82±0.11 0.79±0.23 0.94±0.06 0.89±0.08 0.84501845 0.831215971 0.859287054
aztreonam 0.74±0.18 0.61±0.35 0.87±0.22 0.90±0.11 0.79±0.18 0.74±0.17 0.93±0.06 0.86±0.27 0.65±0.38 0.62±0.38 0.935217903 0.956626506 0.914746544
cefotaxime 0.73±0.19 0.40±0.37 0.96±0.05 0.94±0.06 0.73±0.21 0.68±0.18 0.94±0.08 0.98±0.03 0.51±0.46 0.37±0.39 0.964926591 0.948676824 0.981742739
ceftazidime 0.71±0.15 0.48±0.31 0.95±0.05 0.92±0.08 0.76±0.17 0.71±0.16 0.95±0.08 0.96±0.04 0.58±0.37 0.46±0.34 0.956521739 0.948026316 0.965170797
ceftriaxone 0.80±0.09 0.84±0.07 0.76±0.21 0.87±0.05 0.83±0.10 0.80±0.09 0.79±0.19 0.76±0.24 0.88±0.10 0.83±0.13 0.863157895 0.863157895 0.863157895
cefuroxime 0.58±0.13 0.25±0.28 0.90±0.05 0.84±0.07 0.58±0.14 0.58±0.12 0.89±0.07 0.92±0.06 0.27±0.30 0.24±0.27 0.908468777 0.892436975 0.925087108
ciprofloxacin 0.76±0.17 0.50±0.38 0.92±0.07 0.90±0.09 0.73±0.19 0.70±0.16 0.92±0.10 0.92±0.08 0.55±0.40 0.48±0.39 0.933075435 0.919908467 0.946624804
gentamicin 0.87±0.15 0.76±0.30 0.98±0.02 0.97±0.03 0.90±0.15 0.86±0.16 0.98±0.02 0.98±0.03 0.82±0.30 0.74±0.32 0.983481576 0.982857143 0.984106802
piperacillin/tazobactam 0.53±0.09 0.19±0.18 0.87±0.14 0.80±0.15 0.55±0.07 0.57±0.10 0.93±0.04 0.84±0.19 0.16±0.15 0.30±0.30 0.884810621 0.933278418 0.841128434
tetracycline 0.80±0.08 0.83±0.10 0.77±0.12 0.82±0.07 0.83±0.07 0.81±0.08 0.77±0.15 0.82±0.19 0.90±0.08 0.79±0.18 0.795081967 0.746153846 0.850877193
trimethoprim 0.84±0.14 0.78±0.28 0.90±0.06 0.90±0.04 0.85±0.15 0.84±0.13 0.89±0.08 0.92±0.09 0.81±0.31 0.76±0.29 0.913978495 0.894736842 0.934065934
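To inspect the summary programmatically, something along the following lines should work. This is a minimal sketch assuming the output file is tab-separated with the antibiotic names in the first (index) column; check the separator and column names against your actual file.

import pandas as pd

## hypothetical location, following the output path pattern described above
summary_file = ("<output_path>/Results/software/Pseudo/Escherichia_coli/"
                "Escherichia_coli_kma_False_tree_True_SummaryBenchmarking.txt")
summary = pd.read_csv(summary_file, sep="\t", index_col=0)
print(summary.loc["ciprofloxacin", "f1_macro"])  ## e.g. '0.76±0.17'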

3. The second example: ML-based methods with iterative evaluation

  • Create a folder under the AMR_software directory, preferably named after the software, e.g. Pseudo in this tutorial
  • Copy the Python files benchmarking.py, evaluate_iter.py, run_iteration.sh, and run_val.sh from the Pseudo directory to your newly created folder

3.1 Feature building

  • Integrate the feature-building procedure for the iterative evaluation into benchmarking.py via the prepare_feature function (see 2.1 Feature building above)
  • Integrate the feature-building procedure for classifier selection in the CV into benchmarking.py via the prepare_feature_val function
def prepare_feature_val(self):
    '''
    This feature-preparing procedure is for ML models that
    1) possess an inherent hyperparameter optimization process, which means nested CV is not needed;
    but 2) encompass several classifiers, which means we should still run a CV on the 9 training folds
    at each iteration to select the optimal classifier.
    '''

    df_species,antibiotics=extract_infor(self.level,self.f_all,self.s,self.temp_path,self.software_name)
    for species, antibiotics in zip(df_species, antibiotics):
        ### load antibiotics for this species; load sample PATRIC IDs for this species-antibiotic combination
        antibiotics, ID, _ =  load_data.extract_info(species, False, self.level)
        i_anti = 0

        for anti in antibiotics:
            id_all = ID[i_anti]  ## sample names of all the 10 folds
            i_anti += 1
            id_all = np.array(id_all)

            ### 1. load CV folds
            p_names = name_utility.GETname_meta(species,anti,self.level)
            folds_txt=name_utility.GETname_folds(species,anti,self.level,self.f_kma,self.f_phylotree)
            folders_sample = json.load(open(folds_txt, "rb"))
            folders_index=name2index.Get_index(folders_sample,p_names) ## CV folds

            for out_cv in range(self.cv):
                test_samples_index = folders_index[out_cv]
                train_val_train_index = folders_index[:out_cv] + folders_index[out_cv + 1:]

                for inner_cv in range(self.cv-1):
                    val_index=train_val_train_index[inner_cv]
                    train_index=train_val_train_index[:inner_cv] + train_val_train_index[inner_cv+1:]
                    id_train = id_all[list(itertools.chain.from_iterable(train_index))]  ## sample name list of the CV training set
                    id_val = id_all[val_index]  ## sample name list of the CV validation set

                    #########################################################################################################
                    ## genome sequence locations. Please use this information to load the sequences for building the feature matrix.
                    sequence_train=[str(self.path_sequence) +'/'+ str(genome_id)+'.fna' for genome_id in id_train]
                    sequence_val=[str(self.path_sequence) +'/'+ str(genome_id)+'.fna' for genome_id in id_val]
                    '''
                    2. prepare meta/feature files for this iteration of training samples;
                    only retain samples in the training and validation CV folds.

                    ![Please add your code here for building features]
                    '''
                    ## save the feature matrix to a folder under temp_path
                    data_feature.to_csv('<path_to_feature>', sep="\t")
                    #########################################################################################################

3.2 Iterative evaluation

  • Integrate the model into run_iteration.sh
#!/bin/bash
species="$1"
feature_path="$2"
n_jobs="$3"

readarray -t Anti_List < "${feature_path}/${species}/anti_list"
for anti in "${Anti_List[@]}"; do
    for j in {0..9}; do
        ##########################################################################################
        #### Software-specific command here to train on the 9 training folds and test on the test fold
        ##########################################################################################
        :  ## no-op placeholder; replace with your software-specific command
    done
    wait
done

3.3 Classifier selection via cross-validation

  • Integrate the model into run_val.sh
#!/bin/bash
species="$1"
feature_path="$2"
n_jobs="$3"

readarray -t Anti_List < "${feature_path}/${species}/anti_list"
for anti in "${Anti_List[@]}"; do
    for j in {0..9}; do
        for i in {0..8}; do
          ##########################################################################################
          #### Software-specific command here to train on the 8 training folds and test on the validation fold
          ##########################################################################################
          :  ## no-op placeholder; replace with your software-specific command
        done
    done
done

3.4 Run the workflow

  • Example: phylogeny-based evaluation of software named "Pseudo" on E. coli in the Linux shell
    • Prepare features
    • Iterative evaluation over the 10 folds
    • Cross-validation within the training set at each iteration to select the optimal classifier
python AMR_software/Pseudo/evaluate_iter.py -software 'Pseudo' -s 'Escherichia coli' -sequence '/vol/projects/patric_genome' -temp ${log_path} -f_phylotree -cv 10 -n_jobs ${n_jobs}
bash AMR_software/Pseudo/run_iteration.sh 'Escherichia coli' <feature_path> ${n_jobs}
bash AMR_software/Pseudo/run_val.sh 'Escherichia coli' <feature_path> ${n_jobs}

3.5 Generate summary table

  • The summary-table code needs to be designed according to the software-specific report format; a hedged sketch of one possible aggregation step follows.
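Since per-fold reports differ between tools, only a hedged starting point can be given here. The sketch below assumes you have already parsed, for each of the 10 folds, the true and predicted phenotypes from your software's report; the helper summarize_folds and the mean±std formatting mirror the section 2.4 table but are our own illustration, not part of the workflow.

import numpy as np
from sklearn.metrics import f1_score

def summarize_folds(y_true_per_fold, y_pred_per_fold):
    ## y_true_per_fold / y_pred_per_fold: one list of phenotypes per fold,
    ## parsed from the software-specific per-fold report files
    scores = [f1_score(y_true, y_pred, average='macro')
              for y_true, y_pred in zip(y_true_per_fold, y_pred_per_fold)]
    return "%.2f±%.2f" % (np.mean(scores), np.std(scores))

## e.g. summarize_folds(true_Y, predictY_test) -> '0.84±0.07'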