pg.geno.Deduping

Accessible via pg.geno.Deduping.

class Deduping(generator, hash_fn=None, auto_reward_fn=None, max_duplicates=1, max_proposal_attempts=100)

Bases: pg.DNAGenerator

Deduping generator.

A deduping generator wraps another generator and removes duplicates from the DNAs it proposes.

Hash function

By default, the hash function is the symbolic hash of the DNA, which is the same for any two DNAs whose decisions are the same.

For example:

pg.geno.Deduping(pg.geno.Random())

will only generate unique DNAs. When hash_fn is specified, that user-provided function is used to compute the hash for each DNA.

For example:

pg.geno.Deduping(pg.geno.Random(),
                 hash_fn=lambda dna: sum(dna.to_numbers()))

will dedup based on the sum of all decision values.
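The effect of swapping the hash function can be illustrated without PyGlove. The following is a minimal sketch, not the library's implementation: plain lists of decision values stand in for DNAs, and dedup_by_hash is a hypothetical helper introduced only for this illustration.

```python
def dedup_by_hash(candidates, hash_fn):
    """Yield only the candidates whose hash has not been seen before."""
    seen = set()
    for candidate in candidates:
        key = hash_fn(candidate)
        if key in seen:
            continue  # Same hash as an earlier candidate: treat as duplicate.
        seen.add(key)
        yield candidate


# Plain lists of decision values stand in for DNAs (an assumption of this sketch).
proposals = [[0, 1], [1, 0], [1, 1], [0, 1], [2, 0]]

# Hashing the full decision list drops only exact repeats.
unique_exact = list(dedup_by_hash(proposals, hash_fn=tuple))

# Hashing on the sum also merges [1, 0] with [0, 1] and [2, 0] with [1, 1].
unique_by_sum = list(dedup_by_hash(proposals, hash_fn=sum))

print(unique_exact)
print(unique_by_sum)
```

A coarser hash function therefore shrinks the set of DNAs that count as distinct, which is useful when different decision values describe equivalent candidates.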

Number of duplicates

An optional max_duplicates can be provided to allow each DNA to appear up to that many times. The default of 1 yields only unique DNAs.

For example:

pg.geno.Deduping(pg.geno.Random(), max_duplicates=5)

Note: for inner DNAGenerators that require user feedback, duplication accounting is based on DNAs that are fed back to the DNAGenerator, not on proposed ones.
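The interaction between max_duplicates and a bounded number of proposal attempts can be sketched in plain Python. This is an illustration under stated assumptions, not PyGlove's code: propose_deduped is a hypothetical helper, integers stand in for DNAs, counting happens at proposal time (the case without feedback), and raising RuntimeError when attempts run out is a choice made for this sketch only.

```python
import collections
import itertools


def propose_deduped(propose_fn, hash_fn, counts, max_duplicates=1, max_attempts=100):
    """Return the next proposal whose hash has been accepted fewer than max_duplicates times."""
    for _ in range(max_attempts):
        candidate = propose_fn()
        key = hash_fn(candidate)
        if counts[key] < max_duplicates:
            counts[key] += 1  # Account for the accepted proposal.
            return candidate
    # Assumption of this sketch: give up with an error once attempts are exhausted.
    raise RuntimeError("No acceptable proposal within max_attempts.")


# A tiny search space with three possible values, proposed in a fixed cycle.
source = itertools.cycle([0, 1, 2])
counts = collections.Counter()

accepted = [
    propose_deduped(lambda: next(source), lambda v: v, counts, max_duplicates=2)
    for _ in range(6)
]
print(accepted)
print(dict(counts))
```

With three possible values and max_duplicates=2, at most six proposals can be accepted. A seventh call would use up all of its attempts and fail, which is why an attempt limit is needed when the search space can be exhausted.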

Automatic reward computation

Automatic reward computation is enabled when auto_reward_fn is provided and the inner generator takes feedback. It computes the reward for a new duplicate (one that exceeds the max_duplicates limit) by aggregating the rewards of previous duplicates. Such DNAs are fed back to the DNAGenerator without evaluation by the client (supported by pg.sample through the 'reward' metadata).

For example:

pg.geno.Deduping(pg.evolution.regularized_evolution(),
                 auto_reward_fn=lambda rs: sum(rs) / len(rs))
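The bookkeeping behind automatic rewards can be sketched as follows. This is a hedged illustration rather than the library's implementation: RewardBook and reward_for are hypothetical names, string keys stand in for DNA hashes, and the sketch assumes that only client-evaluated rewards are kept in the history that auto_reward_fn aggregates.

```python
import collections


class RewardBook:
    """Tracks rewards per hash key and auto-computes rewards past the duplicate limit."""

    def __init__(self, auto_reward_fn, max_duplicates=1):
        self._auto_reward_fn = auto_reward_fn
        self._max_duplicates = max_duplicates
        self._rewards = collections.defaultdict(list)

    def reward_for(self, key, evaluate_fn):
        """Return (reward, was_evaluated) for a DNA identified by key."""
        history = self._rewards[key]
        if len(history) >= self._max_duplicates:
            # Over the limit: aggregate earlier rewards instead of evaluating.
            return self._auto_reward_fn(history), False
        reward = evaluate_fn(key)  # Within the limit: the client evaluates.
        history.append(reward)
        return reward, True


# Two client measurements are available; the third request must not need one.
measured = iter([1.0, 3.0])
book = RewardBook(auto_reward_fn=lambda rs: sum(rs) / len(rs), max_duplicates=2)

first = book.reward_for("a", lambda key: next(measured))
second = book.reward_for("a", lambda key: next(measured))
third = book.reward_for("a", lambda key: next(measured))
print(first, second, third)
```

The third request returns the mean of the two earlier rewards and never calls the evaluation function, which mirrors the idea that over-limit duplicates skip client evaluation.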

Attributes:

needs_feedback

Returns True if the DNAGenerator needs feedback.

property needs_feedback: bool

Returns True if the DNAGenerator needs feedback.