This repository was archived by the owner on Jan 7, 2024. It is now read-only.

Conversation


@sourcery-ai sourcery-ai bot commented Nov 8, 2022

Branch master refactored by Sourcery.

If you're happy with these changes, merge this Pull Request using the Squash and merge strategy.

See our documentation here.

Run Sourcery locally

Reduce the feedback loop during development by using the Sourcery editor plugin.

Review changes via command line

To manually merge these changes, make sure you're on the master branch, then run:

```sh
git fetch origin sourcery/master
git merge --ff-only FETCH_HEAD
git reset HEAD^
```
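For reference: `git fetch` retrieves the Sourcery branch, `git merge --ff-only` fast-forwards your local master onto it, and `git reset HEAD^` then un-commits that merge while keeping all of the refactored files in your working tree for review (assuming, as here, the Sourcery branch adds a single commit).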

Help us improve this pull request!

@sourcery-ai sourcery-ai bot requested a review from jgoodson November 8, 2022 17:37
```diff
             return line.split(delim)[1]
-    else:
-        raise RuntimeError("Unable to find version string.")
+    raise RuntimeError("Unable to find version string.")
```

Function get_version refactored with the following changes:
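For context, a plausible reconstruction of the function after this change (the full body is not shown in the hunk, so the file-scanning loop below is an assumption modeled on the common setup.py pattern):

```python
def get_version(rel_path: str, delim: str = '"') -> str:
    # Hypothetical reconstruction: scan a source file for its __version__ line.
    with open(rel_path) as f:
        for line in f:
            if line.startswith('__version__'):
                return line.split(delim)[1]
    # After the refactor the raise sits at function level rather than in an
    # else branch: falling out of the loop means no version line was found.
    raise RuntimeError("Unable to find version string.")
```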


```diff
 for _, model, _ in pkgutil.iter_modules([str(Path(__file__).parent / 'models')]):
-    imported_module = importlib.import_module('.models.' + model, package=__name__)
+    imported_module = importlib.import_module(f'.models.{model}', package=__name__)
```

Lines 13-13 refactored with the following changes:

```diff
         data_path = Path(data_path)
         data_file = f'refseq/maps{max_seq_len}/refseq_{split}.lmdb'
-        refseq_file = f'refseq/refseq.lmdb'
+        refseq_file = 'refseq/refseq.lmdb'
```

Function GeCMaskedReconstructionDataset.__init__ refactored with the following changes:

Comment on lines -230 to -233
```diff
-            else:
-                # 10% chance to keep current representation
-                pass
```

Function GeCMaskedReconstructionDataset._apply_pseudobert_mask refactored with the following changes:

This removes the following comments (why?):

`# 10% chance to keep current representation`

Comment on lines -402 to -405
```diff
-            else:
-                # 10% chance to keep current token
-                pass
```

Function ProteinMaskedLanguageModelingDataset._apply_bert_mask refactored with the following changes:

This removes the following comments (why?):

`# 10% chance to keep current token`
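Both hunks drop the explanatory comment on the do-nothing branch of the standard BERT 80/10/10 masking scheme. A minimal sketch of that scheme (simplified; the real dataset methods also build prediction labels and skip special tokens, which this omits):

```python
import random

def apply_bert_mask(tokens, mask_token='<mask>', vocab=('A', 'C', 'G', 'T')):
    """Simplified 80/10/10 BERT-style masking (a sketch, not the PR's code)."""
    masked = list(tokens)
    for i in range(len(masked)):
        if random.random() >= 0.15:  # only ~15% of positions are corrupted
            continue
        roll = random.random()
        if roll < 0.8:
            masked[i] = mask_token            # 80%: replace with the mask token
        elif roll < 0.9:
            masked[i] = random.choice(vocab)  # 10%: replace with a random token
        # else: 10% chance to keep the current token -- the branch whose
        # explanatory comment the refactoring deleted
    return masked
```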

Comment on lines -167 to +182
logger.error("Couldn't reach server at '{}' to download pretrained model "
"configuration file.".format(config_file))
logger.error(
f"Couldn't reach server at '{config_file}' to download pretrained model configuration file."
)

else:
logger.error(
"Model name '{}' was not found in model name list ({}). "
"We assumed '{}' was a path or url but couldn't find any file "
"associated to this path or url.".format(
pretrained_model_name_or_path,
', '.join(cls.pretrained_config_archive_map.keys()),
config_file))
f"Model name '{pretrained_model_name_or_path}' was not found in model name list ({', '.join(cls.pretrained_config_archive_map.keys())}). We assumed '{config_file}' was a path or url but couldn't find any file associated to this path or url."
)

raise
if resolved_config_file == config_file:
logger.info("loading configuration file {}".format(config_file))
logger.info(f"loading configuration file {config_file}")
else:
logger.info("loading configuration file {} from cache at {}".format(
config_file, resolved_config_file))
logger.info(
f"loading configuration file {config_file} from cache at {resolved_config_file}"
)


Function BioConfig.from_pretrained refactored with the following changes:

"""Serializes this instance to a Python dictionary."""
output = copy.deepcopy(self.__dict__)
return output
return copy.deepcopy(self.__dict__)

Function BioConfig.to_dict refactored with the following changes:

Comment on lines -49 to +51
"Parameter config in `{}(config)` should be an instance of class "
"`BioConfig`. To create a model from a pretrained model use "
"`model = {}.from_pretrained(PRETRAINED_MODEL_NAME)`".format(
self.__class__.__name__, self.__class__.__name__
))
f"Parameter config in `{self.__class__.__name__}(config)` should be an instance of class `BioConfig`. To create a model from a pretrained model use `model = {self.__class__.__name__}.from_pretrained(PRETRAINED_MODEL_NAME)`"
)


Function BioModel.__init__ refactored with the following changes:

Comment on lines -67 to +80
```diff
-                p for n, p in param_optimizer if not any(nd in n for nd in no_decay)
+                p
+                for n, p in param_optimizer
+                if all(nd not in n for nd in no_decay)
             ],
             "weight_decay": 0.01,
         },
         {
             "params": [
-                p for n, p in param_optimizer if any(nd in n for nd in no_decay)
+                p
+                for n, p in param_optimizer
+                if any(nd in n for nd in no_decay)
             ],
             "weight_decay": 0.0,
         },
     ]
```


Function BioModel.configure_optimizers refactored with the following changes:
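Note that `all(nd not in n ...)` is just De Morgan's rewrite of the original `not any(nd in n ...)`, so behavior is unchanged. The comprehensions implement the usual transformers-style split between parameters that receive weight decay and those (biases, norm weights) that do not. A self-contained sketch of the pattern (the `no_decay` contents here are an assumption based on common practice, not shown in this diff):

```python
import torch

model = torch.nn.Linear(4, 2)  # stand-in for the real BioModel
no_decay = ('bias', 'LayerNorm.weight')
param_optimizer = list(model.named_parameters())
optimizer_grouped_parameters = [
    {   # parameters that should be regularized
        "params": [p for n, p in param_optimizer
                   if all(nd not in n for nd in no_decay)],
        "weight_decay": 0.01,
    },
    {   # biases / norm weights: no weight decay
        "params": [p for n, p in param_optimizer
                   if any(nd in n for nd in no_decay)],
        "weight_decay": 0.0,
    },
]
optimizer = torch.optim.AdamW(optimizer_grouped_parameters, lr=1e-4)
```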

```diff
@@ -1,5 +1,6 @@
 """PyTorch BERT model. """
```


Lines 12-15 refactored with the following changes:

```diff
-            self.block_sizes = block_sizes
-        else:
-            self.block_sizes = [num_hidden_layers // 3] * 3
+        self.block_sizes = block_sizes or [num_hidden_layers // 3] * 3
```

Function BioFunnelConfig.__init__ refactored with the following changes:
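The `or` form is equivalent as long as the original condition tested truthiness. One general caveat (ours, not Sourcery's): `or` falls back on any falsy value, so an explicitly passed empty list would also be replaced by the default; use an `is None` check if that distinction matters.

```python
def pick_block_sizes(block_sizes, num_hidden_layers=12):
    # truthiness fallback, as in the refactored code
    return block_sizes or [num_hidden_layers // 3] * 3

print(pick_block_sizes(None))       # [4, 4, 4]
print(pick_block_sizes([]))         # [4, 4, 4] -- empty list also replaced
print(pick_block_sizes([2, 4, 6]))  # [2, 4, 6]
```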

Comment on lines -65 to +69
```diff
-            archive_file = cls.pretrained_model_archive_map[pretrained_model_name_or_path]
+            return cls.pretrained_model_archive_map[pretrained_model_name_or_path]
         elif os.path.isdir(pretrained_model_name_or_path):
-            archive_file = os.path.join(pretrained_model_name_or_path, WEIGHTS_NAME)
+            return os.path.join(pretrained_model_name_or_path, WEIGHTS_NAME)
         else:
-            archive_file = pretrained_model_name_or_path
-        return archive_file
+            return pretrained_model_name_or_path
```

Function TAPEModelMixin._get_model refactored with the following changes:

Comment on lines -74 to +73
```diff
-        new_keys = {}
-        for key in state_dict.keys():
-            new_keys[key] = cls._rewrite_module_name(key)
+        new_keys = {key: cls._rewrite_module_name(key) for key in state_dict}
```

Function TAPEModelMixin._rewrite_state_dict refactored with the following changes:

Comment on lines -246 to +256
```diff
         if not hasattr(model, cls.base_model_prefix) and \
-                any(s.startswith(cls.base_model_prefix) for s in state_dict.keys()):
-            start_prefix = cls.base_model_prefix + '.'
+                any(s.startswith(cls.base_model_prefix) for s in state_dict.keys()):
+            start_prefix = f'{cls.base_model_prefix}.'
         if hasattr(model, cls.base_model_prefix) and \
-                not any(s.startswith(cls.base_model_prefix) for s in state_dict.keys()):
+                not any(s.startswith(cls.base_model_prefix) for s in state_dict.keys()):
             model_to_load = getattr(model, cls.base_model_prefix)

         load(model_to_load, prefix=start_prefix)
-        if len(missing_keys) > 0:
+        if missing_keys:
             logger.info("Weights of {} not initialized from pretrained model: {}".format(
                 model.__class__.__name__, missing_keys))
-        if len(unexpected_keys) > 0:
+        if unexpected_keys:
             logger.info("Weights from pretrained model not used in {}: {}".format(
                 model.__class__.__name__, unexpected_keys))
-        if len(error_msgs) > 0:
+        if error_msgs:
```

Function TAPEModelMixin.from_pretrained refactored with the following changes:
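The `len(x) > 0` to `if x:` rewrites rely on the implicit truthiness of Python sequences (an empty list is falsy), which is also the style PEP 8 recommends; behavior is unchanged.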

Comment on lines -46 to +51
```diff
-            [T5Block(config, has_relative_attention_bias=bool(i == 0)) for i in range(config.num_layers)]
+            [
+                T5Block(config, has_relative_attention_bias=i == 0)
+                for i in range(config.num_layers)
+            ]
         )
```


Function T5Stack.__init__ refactored with the following changes:
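Dropping the `bool(...)` wrapper is safe: `i == 0` already evaluates to a bool, so the wrapper was a no-op.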

```diff
-        outputs = sequence_logits
-
-        return outputs
+        return self.classify(sequence_output)
```

Function SequenceToSequenceClassificationHead.forward refactored with the following changes:

```diff
-        logits = self.classify(pooled_output)
-
-        return logits
+        return self.classify(pooled_output)
```

Function SequenceClassificationHead.forward refactored with the following changes:

Comment on lines -134 to -140
```diff
-        loader = DataLoader(
+        return DataLoader(
             dataset,
             num_workers=self.num_workers,
             collate_fn=dataset.collate_fn,
             batch_sampler=batch_sampler,
         )
-        return loader
```

Function BioDataModule._prep_loader refactored with the following changes:

```diff
         strands = torch.ones(shape[:-1], dtype=torch.long)
         if lengths:
-            lengths = torch.ones(shape[:-1], dtype=torch.long) * lengths
+            lengths *= torch.ones(shape[:-1], dtype=torch.long)
```

Function TestGeCBertRaw.simpleForwardZeros refactored with the following changes:

  • Replace assignment with augmented assignment (aug-assign)
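One subtlety worth flagging (our note, and assuming `lengths` arrives as a plain scalar, as the original right-multiplication suggests): a Python int has no in-place multiply, so `*=` falls back to `lengths = lengths * tensor` and both spellings rebind `lengths` to the same broadcasted tensor.

```python
import torch

shape = (2, 8, 16)
lengths = 4  # assumption: a scalar argument, as the original code implies
lengths *= torch.ones(shape[:-1], dtype=torch.long)
# `int` has no __imul__, so `*=` degrades to plain multiplication,
# yielding a (2, 8) LongTensor filled with 4s in both spellings
assert lengths.shape == (2, 8) and bool((lengths == 4).all())
```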

Comment on lines -89 to +91
```diff
-        for batch in SubsetRandomSampler(
-                list(BatchSampler(sorted_sampler, self.batch_size, self.drop_last))):
-            yield batch
+        yield from SubsetRandomSampler(
+            list(BatchSampler(sorted_sampler, self.batch_size, self.drop_last))
+        )
```

Function BucketBatchSampler.__iter__ refactored with the following changes:

  • Replace yield inside for loop with yield from (yield-from)
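`yield from` delegates iteration to the sampler directly instead of re-yielding each batch in a loop; the two forms are interchangeable here. A minimal standalone illustration (generic iterables stand in for the torch samplers):

```python
def batches_loop(sampler):
    for batch in sampler:   # original form
        yield batch

def batches_delegate(sampler):
    yield from sampler      # refactored form

data = [[0, 1], [2, 3], [4]]
assert list(batches_loop(data)) == list(batches_delegate(data))
```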


sourcery-ai bot commented Nov 8, 2022

Sourcery Code Quality Report

✅  Merging this PR will increase code quality in the affected files by 0.19%.

| Quality metrics | Before | After | Change |
|-----------------|--------|-------|--------|
| Complexity | 2.93 ⭐ | 2.77 ⭐ | -0.16 👍 |
| Method Length | 56.03 ⭐ | 55.70 ⭐ | -0.33 👍 |
| Working memory | 6.78 🙂 | 6.79 🙂 | 0.01 👎 |
| Quality | 75.06% | 75.25% | 0.19% 👍 |

| Other metrics | Before | After | Change |
|---------------|--------|-------|--------|
| Lines | 2927 | 2907 | -20 |
| Changed files | Quality Before | Quality After | Quality Change |
|---------------|----------------|---------------|----------------|
| setup.py | 79.44% ⭐ | 79.98% ⭐ | 0.54% 👍 |
| tragec/__init__.py | 62.67% 🙂 | 62.53% 🙂 | -0.14% 👎 |
| tragec/datasets.py | 77.35% ⭐ | 77.57% ⭐ | 0.22% 👍 |
| tragec/registry.py | 72.47% 🙂 | 77.30% ⭐ | 4.83% 👍 |
| tragec/tokenizers.py | 90.98% ⭐ | 91.00% ⭐ | 0.02% 👍 |
| tragec/training.py | 58.65% 🙂 | 58.77% 🙂 | 0.12% 👍 |
| tragec/models/configuration.py | 67.03% 🙂 | 64.87% 🙂 | -2.16% 👎 |
| tragec/models/modeling.py | 64.33% 🙂 | 64.26% 🙂 | -0.07% 👎 |
| tragec/models/models_bert.py | 93.01% ⭐ | 92.93% ⭐ | -0.08% 👎 |
| tragec/models/models_funnel.py | 89.20% ⭐ | 90.61% ⭐ | 1.41% 👍 |
| tragec/models/tape_model.py | 44.17% 😞 | 43.16% 😞 | -1.01% 👎 |
| tragec/models/utils_t5.py | 71.16% 🙂 | 71.74% 🙂 | 0.58% 👍 |
| tragec/tasks/task_mlm.py | 82.26% ⭐ | 82.28% ⭐ | 0.02% 👍 |
| tragec/tasks/task_mrm.py | 72.48% 🙂 | 72.42% 🙂 | -0.06% 👎 |
| tragec/tasks/task_multiclass.py | 75.37% ⭐ | 75.42% ⭐ | 0.05% 👍 |
| tragec/tasks/task_pairwisecontact.py | 76.71% ⭐ | 76.72% ⭐ | 0.01% 👍 |
| tragec/tasks/task_seq2seqclass.py | 84.08% ⭐ | 83.79% ⭐ | -0.29% 👎 |
| tragec/tasks/task_singleclass.py | 85.02% ⭐ | 84.91% ⭐ | -0.11% 👎 |
| tragec/tasks/tasks.py | 81.54% ⭐ | 81.77% ⭐ | 0.23% 👍 |
| tragec/test/test_model.py | 88.10% ⭐ | 88.12% ⭐ | 0.02% 👍 |
| tragec/utils/_sampler.py | 85.54% ⭐ | 85.80% ⭐ | 0.26% 👍 |

Here are some functions in these files that still need a tune-up:

| File | Function | Complexity | Length | Working Memory | Quality | Recommendation |
|------|----------|------------|--------|----------------|---------|----------------|
| tragec/models/tape_model.py | TAPEModelMixin.from_pretrained | 29 😞 | 404 ⛔ | | 16.84% ⛔ | Refactor to reduce nesting. Try splitting into smaller methods |
| tragec/models/modeling.py | BioModel.configure_optimizers | 15 🙂 | 243 ⛔ | 13 😞 | 36.38% 😞 | Try splitting into smaller methods. Extract out complex expressions |
| tragec/models/configuration.py | BioConfig.__init__ | 1 ⭐ | 212 ⛔ | 36 ⛔ | 40.48% 😞 | Try splitting into smaller methods. Extract out complex expressions |
| tragec/training.py | run_train | 5 ⭐ | 195 😞 | 16 ⛔ | 44.83% 😞 | Try splitting into smaller methods. Extract out complex expressions |
| tragec/training.py | process_trainer_kwargs | 12 🙂 | 225 ⛔ | 8 🙂 | 49.45% 😞 | Try splitting into smaller methods |

Legend and Explanation

The emojis denote the absolute quality of the code:

  • ⭐ excellent
  • 🙂 good
  • 😞 poor
  • ⛔ very poor

The 👍 and 👎 indicate whether the quality has improved or gotten worse with this pull request.


Please see our documentation here for details on how these metrics are calculated.

We are actively working on this report - lots more documentation and extra metrics to come!

Help us improve this quality report!
