pg.geno.Deduping
Accessible via pg.geno.Deduping.
- class Deduping(generator, hash_fn=None, auto_reward_fn=None, max_duplicates=1, max_proposal_attempts=100)
Bases: pg.DNAGenerator
Deduping generator.
A deduping generator wraps another generator and removes duplicates from the DNAs it proposes.
Hash function
By default, the hash function is the symbolic hash of the DNA, which is identical for DNAs that carry the same decisions.
For example:
pg.geno.Deduping(pg.geno.Random())
will only generate unique DNAs. When hash_fn is specified, the user-provided function is used to compute the hash of each DNA instead.
For example:
pg.geno.Deduping(pg.geno.Random(), hash_fn=lambda dna: sum(dna.to_numbers()))
will dedup based on the sum of all decision values.
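The effect of the hash function can be illustrated without PyGlove. The following is a minimal sketch, not the library's implementation: dedup is a hypothetical helper that filters a stream of plain tuples, and stopping after max_proposal_attempts consecutive duplicates is an assumption about how giving up might look.

```python
import itertools


def dedup(proposals, hash_fn=lambda x: x, max_proposal_attempts=100):
    """Yield proposals whose hash has not been seen before (illustrative only)."""
    seen = set()
    attempts = 0  # consecutive duplicates since the last unique proposal
    for candidate in proposals:
        key = hash_fn(candidate)
        if key in seen:
            attempts += 1
            if attempts >= max_proposal_attempts:
                return  # assumption: give up after too many duplicates in a row
            continue
        seen.add(key)
        attempts = 0
        yield candidate


# A deterministic stand-in for an inner generator: cycles over all pairs forever.
all_pairs = list(itertools.product(range(3), repeat=2))

# Default hash: the candidate itself, so every distinct pair is kept once.
unique_pairs = list(dedup(itertools.cycle(all_pairs)))

# Custom hash: the sum of the decision values, so pairs with equal sums collide.
unique_sums = list(dedup(itertools.cycle(all_pairs), hash_fn=sum))
```

With the default hash all nine pairs survive, while hashing by sum keeps only the first pair seen for each of the five possible sums.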
Number of duplicates
The optional max_duplicates argument (default 1) sets how many DNAs with the same hash may be generated, which allows a few duplicates.
For example:
pg.geno.Deduping(pg.geno.Random(), max_duplicates=5)
Note: for inner DNAGenerators that require user feedback, duplicates are counted among the DNAs that are fed back to the DNAGenerator, not among the proposed ones.
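The counting behind max_duplicates can be sketched in plain Python. This is an illustration under simplifying assumptions rather than PyGlove code: dedup_with_limit is a hypothetical helper, and it counts at proposal time, which corresponds to an inner generator that does not need feedback.

```python
from collections import Counter


def dedup_with_limit(proposals, hash_fn=lambda x: x, max_duplicates=1):
    """Yield each proposal until its hash has been yielded max_duplicates times."""
    counts = Counter()
    for candidate in proposals:
        key = hash_fn(candidate)
        if counts[key] < max_duplicates:
            counts[key] += 1
            yield candidate
        # Otherwise the candidate exceeds the limit and is skipped.


stream = ["a", "b", "a", "a", "c", "b", "a"]

strict = list(dedup_with_limit(stream))                    # each value at most once
relaxed = list(dedup_with_limit(stream, max_duplicates=2))  # each value at most twice
```

Here strict keeps one copy of each value, while relaxed lets "a" and "b" through twice and drops the later copies of "a".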
Automatic reward computation
Automatic reward computation is enabled when auto_reward_fn is provided and the inner generator takes feedback. It lets users compute the reward for new duplicates (those that exceed the max_duplicates limit) by aggregating the rewards of previous duplicates. Such DNAs are fed back to the DNAGenerator without being evaluated by the client (supported by pg.sample through the ‘reward’ metadata).
For example:
pg.geno.Deduping(pg.evolution.regularized_evolution(), auto_reward_fn=lambda rs: sum(rs) / len(rs))
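The aggregation idea can be sketched in plain Python as follows. This is a simplified model, not the library's implementation: evaluate_with_auto_reward and fake_evaluate are hypothetical names, and the choice not to add auto-computed rewards to the history is an assumption made for this example.

```python
from collections import defaultdict


def evaluate_with_auto_reward(candidates, evaluate, auto_reward_fn,
                              hash_fn=lambda x: x, max_duplicates=1):
    """Return (candidate, reward, was_evaluated) triples (illustrative only)."""
    history = defaultdict(list)  # hash -> rewards obtained from real evaluations
    results = []
    for candidate in candidates:
        past = history[hash_fn(candidate)]
        if len(past) >= max_duplicates:
            # Duplicate beyond the limit: aggregate earlier rewards, skip evaluation.
            results.append((candidate, auto_reward_fn(past), False))
        else:
            reward = evaluate(candidate)
            past.append(reward)
            results.append((candidate, reward, True))
    return results


# Stand-in for a noisy client evaluation: returns preset rewards in order.
preset_rewards = iter([1.0, 4.0, 3.0, 6.0])


def fake_evaluate(candidate):
    return next(preset_rewards)


results = evaluate_with_auto_reward(
    [3, 5, 3, 3, 5],
    evaluate=fake_evaluate,
    auto_reward_fn=lambda rs: sum(rs) / len(rs),
    max_duplicates=2,
)
```

In this run the third occurrence of candidate 3 is not evaluated; it receives the mean of its two earlier rewards instead.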
Attributes:
needs_feedback: Returns True if the DNAGenerator needs feedback.