Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
harness: Detector only #833
base: main
Are you sure you want to change the base?
harness: Detector only #833
Changes from 16 commits
d9e7f28
c8b7e77
13c89fe
b87068f
df107c2
d9d43a6
662fbf0
15c1097
3f8e263
4714757
9f63ab1
1b5aa46
239cfc8
65c0c36
e9eb742
a35cb5a
eab5c67
c3d33d4
153ff2e
5086e80
6526194
a7b3e1f
dbe916a
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-- out of scope for here, but we should implement serialization/deserialization for
Attempt
sThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this skip the attempt constructor? Can we add an explicit type signature to signal what
cls
is expected to be?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cls
is the callable for theclass
which will be anAttempt
. This will call the__init__()
method with all defaults.Due to the current overrides in the class
attempt_obj.outputs
below may not produce the same in memory object for a multi-turn conversation attempt since the existingas_dict()
method serializedoutputs
into the log and not the full messages history.For the purposes of this PR I suspect this is acceptable, however it is worth noting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you just run me through the reasoning here? would this clobber harness config loaded from global / site / cli-specific config YAML?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect this could be avoided. We should not clobber
config
without an explicit override from a command line flag.If no specific harness was provided via
config
and no detectors were provided perparsed_spec
that is the determining factor on which default harness to load when_config.plugins.harnesses
does not contain any configuration data for a specific harness. This does expose that there may be a missing top level parameter to select a specific harness if default config were to provide for various harness types. Currently, finding config forDetectorOnly
to be a selection criteria seems a bit brittle.If there is a desire to consolidate harness selection I maybe something like:
There is probably more abstraction possible here but this offers an idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the refactor for harness selection offered is not used, this needs to be removed as
start_run()
was called before entering this conditional.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jmartin-tech What do you think this is telling us about where responsibility for orchestrating runs lies? Is the existing
Harness
interface just too inflexible to make invoking novel things likeDetectorOnly
fromgarak.cli
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not quite what I was thinking in terms of obtaining the detectors from the original log. The actual extractions looks good as the
start_run
will contain the expandeddetector
list however it ignores existing top level arguments and the spec parsing support for options set on the harness.The harness could accept the list of detectors provided via the parsed_spec for detectors from the command line and an
evaluator
as other harnesses do, if no detectors were provided then the list of detectors can be obtained based on config from the original report.If I am reading this correctly, this is expecting
detectors
andeval_threshold
to be set in the harness config and falling back if not found, this would not account for the top level command line options that as a user I would expect to be applied.It looks like the current expectation would be a config like:
h_options.json:
With a command line like:
However based on the existing options a user may have expectations for
-d all
to apply all detectors when passed as an option.h_options.json:
With a command line like:
My thought here is that the
garak.command
module should not need access to data from the cli but should be be provided information that takes advantage of cli having parsed all the setup options and it's support for things like expanding a detector classes based on a module name.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, so I will add the config and then support only
--detectors
/-d
as a top-level argument, and if that is not present, fall back to the ones present within the report. It makes more sense from a user perspective 👍There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vidushiMaheshwari, circling back to check on progress.
I am happy to monitor this PR or offer parts of what I suggested as a PR to your branch in the coming weeks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I apologize for being inactive, just pushed the changes which I believe should suffice the comments. Would appreciate a PR with suggested changes!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this work?