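The process_final function in the AlphaFold3 feature_processing_multimer module is the final step of the data processing pipeline. It runs after merging and pairing, and applies a series of post-processing operations to np_example (a dictionary of features). These steps ensure that every feature is in the format AlphaFold3 requires for training and prediction, so that the example is ready to be used as model input.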
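The main tasks of process_final are:
- Call a series of helper functions on np_example in order: _correct_msa_restypes, _make_seq_mask, _make_msa_mask, and _filter_features.
- Return the post-processed np_example, which contains the features suitable for model input.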
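The effect of these steps on a feature dictionary can be illustrated with a minimal sketch on toy data. It mirrors the restype remapping and mask construction from the source, but toy_process_final and TOY_RESTYPE_MAP are hypothetical names introduced only for this illustration: the 4-entry table stands in for residue_constants.MAP_HHBLITS_AATYPE_TO_OUR_AATYPE, and the feature-filtering step is left out.

```python
import numpy as np

# Hypothetical 4-entry permutation standing in for
# residue_constants.MAP_HHBLITS_AATYPE_TO_OUR_AATYPE (the real table is longer).
TOY_RESTYPE_MAP = np.array([2, 0, 3, 1], dtype=np.int32)


def toy_process_final(np_example):
    """Toy version of the post-processing chain (filtering step omitted)."""
    out = dict(np_example)  # shallow copy so the caller's dict is not modified
    # Remap every MSA residue index through the lookup table.
    out["msa"] = np.take(TOY_RESTYPE_MAP, out["msa"], axis=0).astype(np.int32)
    # Positions with entity_id > 0 are treated as real residues; 0 marks padding.
    out["seq_mask"] = (out["entity_id"] > 0).astype(np.float32)
    # Start from all ones, then zero the padded columns by broadcasting the
    # (num_res,) sequence mask across the (num_seq, num_res) MSA.
    msa_mask = np.ones_like(out["msa"], dtype=np.float32)
    out["msa_mask"] = msa_mask * out["seq_mask"][None]
    return out


example = {
    "msa": np.array([[0, 1, 2, 3], [3, 2, 1, 0]]),  # 2 sequences, 4 positions
    "entity_id": np.array([1, 1, 2, 0]),            # last position is padding
}
result = toy_process_final(example)
print(result["msa"])
print(result["seq_mask"])
print(result["msa_mask"])
```

In this toy case the last position has entity_id 0, so it is masked out in seq_mask and in every row of msa_mask.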
Source code:
def process_final(np_example: Mapping[str, np.ndarray]
                  ) -> Mapping[str, np.ndarray]:
    """Final processing steps in data pipeline, after merging and pairing."""
    np_example = _correct_msa_restypes(np_example)
    np_example = _make_seq_mask(np_example)
    np_example = _make_msa_mask(np_example)
    np_example = _filter_features(np_example)
    return np_example


def _correct_msa_restypes(np_example):
    """Correct MSA restype to have the same order as residue_constants."""
    new_order_list = residue_constants.MAP_HHBLITS_AATYPE_TO_OUR_AATYPE
    np_example['msa'] = np.take(new_order_list, np_example['msa'], axis=0)
    np_example['msa'] = np_example['msa'].astype(np.int32)
    return np_example


def _make_seq_mask(np_example):
    np_example['seq_mask'] = (np_example['entity_id'] > 0).astype(np.float32)
    return np_example


def _make_msa_mask(np_example):
    """Mask features are all ones, but will later be zero-padded."""
    np_example['msa_mask'] = np.ones_like(np_example['msa'], dtype=np.float32)
    seq_mask = (np_example['entity_id'] > 0).astype(np.float32)
    np_example['msa_mask'] *= seq_mask[None]
    return np_example


REQUIRED_FEATURES = frozenset({
    'aatype', 'all_atom_mask', 'all_atom_positions', 'all_chains_entity_ids',
    'all_crops_all_chains_mask', 'all_crops_all_chains_positions',
    'all_crops_all_chains_residue_ids', 'assembly_num_chains', 'asym_id',
    'bert_mask', 'cluster_bias_mask', 'deletion_matrix', 'deletion_mean',
    'entity_id', 'entity_mask', 'mem_peak', 'msa', 'msa_mask', 'num_alignments',
    'num_templates', 'queue_size', 'residue_index', 'resolution',
    'seq_length', 'seq_mask', 'sym_id', 'template_aatype',
    'template_all_atom_mask', 'template_all_atom_positions'
})


def _filter_features(np_examp