Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Added dataset parameters to dataflow debug settings and added tests #228

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

ClementVaillantCodit
Copy link
Contributor

  • Added dataset parameters to dataflow debug settings
    • For information, debugSettings.DatasetParameters is of type BinaryData. Data is basically a JSON representation of a Dictionary<string, Dictionary<string, object>>, where the high level dictionary contains the keys of the dataset names, and the low level dictionary contains the actual dataset parameter keys and values
  • Added tests, @stijnmoreels since DataSetParameters property is internal, not sure how I could assert that if a dataset parameter is added to a not empty dictionary of parameters, it is added to the dictionary and not overridden. Any suggestions for this please? Should I expose the RunDataFlowOptions.DataSetParameters via a method?

Copy link

netlify bot commented Nov 21, 2024

Deploy Preview for arcus-testing canceled.

Name Link
🔨 Latest commit a2b4aa4
🔍 Latest deploy log https://app.netlify.com/sites/arcus-testing/deploys/674740b4efbc0f0007bad6f6

@stijnmoreels stijnmoreels changed the title Added dataset parameters to dataflow debug settings and added tests feat: Added dataset parameters to dataflow debug settings and added tests Nov 21, 2024
Copy link
Member

@stijnmoreels stijnmoreels left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great that you found the time to work on this a bit!
Besides these comments and related to your question on 'how to test this', I would not make the DataSetParameters public as asserting on the existence of those parameters in that dictionary does not garantuee that the dataset parameter was 'correctly added'. That can only be determined if those parameters are actually used with a real dataset.

Is it an idea to make the container name of the datasets in the TemporaryDataFactoryDataFlow test fixture configurable with dataset parameters? Or some other metadata stuff (can be anything). The same test fixture can then expose a method to retrieve back that metadata so that the test can assert that indeed adding dataset parameters result in a changed dataset.

So:

  • Add integration test with at least three dataset parameters to verify that multiple parameters for the same dataset and for different datasets can be added correctly.
  • Update docs\preview\02-Features\06-Integration/01-data-factory.mdx with the new options.AddDataSetParameter(...) method.

…ugSession.cs

Co-authored-by: Stijn Moreels <9039753+stijnmoreels@users.noreply.github.com>
…ugSession.cs

Co-authored-by: Stijn Moreels <9039753+stijnmoreels@users.noreply.github.com>
- Reorganized using directives in TemporaryDataFlowDebugSession.cs.
- Modified RunDataFlowOptions to change DataSetParameters type, add null checks, and handle nested dictionaries.
- Enhanced TemporaryDataFactoryDataFlow with new properties, methods, and dataset parameter handling.
- Updated RunDataFlowTests with a new test for dataset parameters and removed obsolete tests.
@ClementVaillantCodit
Copy link
Contributor Author

@stijnmoreels I'd appreciate a fresh review on the newly pushed changes please. Added new methods specific to run the dataflow with dataflow params, really unsure if that's a good way to go.
Looking forward to read your comments!

@arcus-automation
Copy link
Collaborator

arcus-automation commented Nov 27, 2024

🧪 Code coverage summary

Metric Value
Line coverage 🟢 91.6%
Branch coverage 🟢 81%

Great job! 😎 Your code coverage is higher than my caffeine levels! ☕

/// <exception cref="ArgumentException">Thrown when the <paramref name="datasetName"/> is blank.</exception>
/// <exception cref="ArgumentException">Thrown when the <paramref name="parameterName"/> is blank.</exception>
/// <exception cref="ArgumentNullException">Thrown when the <paramref name="parameterValue"/> is null.</exception>
public RunDataFlowOptions AddDataSetParameter(string datasetName, string parameterName, object parameterValue)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stijnmoreels datasetName param name does not really reflects what it is, since this parameter represents either the source or sink name.
Shall we leave this name or do you have a suggestion for a better naming please?
sourceOrSinkName maybe?

/// <summary>
/// Creates a DataFlow with a CSV source and sink on an Azure DataFactory resource.
/// </summary>
public static async Task<TemporaryDataFactoryDataFlow> CreateWithCsvSinkSourceAndDataSetParametersAsync(TestConfig config, ILogger logger, Action<AssertCsvOptions> configureOptions)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stijnmoreels added custom methods to add source and sink with parameters... Is this OK?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Integration.DataFactory] Add support to set Dataset parameters
3 participants