Data Mapping benchmark #14

Open
jzabroski opened this issue Jan 20, 2021 · 2 comments

Comments

@jzabroski

See the discussion in #5: ML.NET doesn't have a low-level API, so some CSV readers need a higher-level benchmark in order to participate.

@JoshClose
Contributor

I started implementing FileHelpers and found that it only outputs class objects. There is no way to get the individual string fields out of each row.

@joelverhagen If you want to figure out how you would like to structure reading records, I could go in and add CsvHelper and FileHelpers to it (at the very least). I just need a template in the source like the one you have for parsing.
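
To make the distinction concrete, here is a minimal sketch of the two benchmark shapes being discussed. Every name in it (`IFieldReader`, `IRecordReader<T>`, `MappingRecordReader<T>`, `SplitFieldReader`) is hypothetical and not taken from the benchmark repository or from any CSV library. It only illustrates that a field-level reader can be lifted into a record-level one, while a library that only returns objects cannot take part in a field-level benchmark.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical low-level shape: the library hands back raw string fields per row.
public interface IFieldReader
{
    IEnumerable<string[]> ReadFields(TextReader input);
}

// Hypothetical high-level shape: the library only hands back mapped objects.
public interface IRecordReader<T>
{
    IEnumerable<T> ReadRecords(TextReader input);
}

// Lifts a field-level reader into a record-level one via a caller-supplied mapping.
public sealed class MappingRecordReader<T> : IRecordReader<T>
{
    private readonly IFieldReader _fields;
    private readonly Func<string[], T> _map;

    public MappingRecordReader(IFieldReader fields, Func<string[], T> map)
    {
        _fields = fields;
        _map = map;
    }

    public IEnumerable<T> ReadRecords(TextReader input)
    {
        foreach (var row in _fields.ReadFields(input))
            yield return _map(row);
    }
}

// Stand-in parser for illustration only: splits on commas, no quoting support.
public sealed class SplitFieldReader : IFieldReader
{
    public IEnumerable<string[]> ReadFields(TextReader input)
    {
        while (true)
        {
            var line = input.ReadLine();
            if (line == null) yield break;
            yield return line.Split(',');
        }
    }
}
```

With this shape, `new MappingRecordReader<int>(new SplitFieldReader(), f => f.Length)` turns each row into its field count. Going the other way, from objects back to raw fields, is not generally possible, which is why object-only libraries need a record-level benchmark.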

@jzabroski
Author

Another thing I use a lot with CsvHelper is this pattern, where I record the CSV row number on each record while importing it:

using System.Diagnostics.CodeAnalysis;
using CsvHelper.Configuration;
namespace Infra.Chat.Services.Importers.Models
{
    public class ChatLineMap : ClassMap<ChatLine>
    {
        [SuppressMessage("ReSharper", "VirtualMemberCallInConstructor")]
        public ChatLineMap()
        {
            Map(x => x.Id).Name("Id");
            Map(x => x.GroupId).Name("GroupId");
            Map(x => x.UserName).Name("UserName").TypeConverter<StringTypeConverter>();
            Map(x => x.Message).Name("Message").TypeConverter<StringTypeConverter>();
            Map(x => x.LastModified).Name("LastModified");
            Map(x => x.LineNumber).ConvertUsing(x => x.Context.Row); // Source row number; any high-level benchmark should ideally exercise this
        }
    }
}

I realize this is specific, but it's a feature that makes a lot of sense. It lets end users trace a bad record back to the row where the parser went wrong.
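
As a library-independent illustration of the same idea, here is a minimal sketch that stamps each record with its source row number. The `ChatLine` property types, the `NaiveChatLineReader` class, the fixed column order, and the 1-based numbering with the header as row 1 are all assumptions made for this example. It does not handle quoted fields or embedded newlines, and it is not how CsvHelper implements the feature.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

// Hypothetical record type; property names mirror the mapped columns, types are assumed.
public sealed class ChatLine
{
    public int Id { get; set; }
    public int GroupId { get; set; }
    public string UserName { get; set; } = "";
    public string Message { get; set; } = "";
    public DateTime LastModified { get; set; }
    public int LineNumber { get; set; } // 1-based row in the source; the header is row 1
}

public static class NaiveChatLineReader
{
    // Reads records from CSV text and stamps each one with its row number.
    // Assumes a header row, a fixed column order, and no quoted fields.
    public static IEnumerable<ChatLine> Read(TextReader reader)
    {
        var culture = CultureInfo.InvariantCulture;
        int row = 0;
        while (true)
        {
            var line = reader.ReadLine();
            if (line == null) yield break;
            row++;
            if (row == 1) continue;         // skip the header row
            if (line.Length == 0) continue; // skip blank rows but keep counting them

            string[] f = line.Split(',');
            if (f.Length != 5)
                throw new FormatException($"Row {row}: expected 5 fields, found {f.Length}.");

            yield return new ChatLine
            {
                Id = int.Parse(f[0], culture),
                GroupId = int.Parse(f[1], culture),
                UserName = f[2],
                Message = f[3],
                LastModified = DateTime.Parse(f[4], culture),
                LineNumber = row,
            };
        }
    }
}
```

Calling `NaiveChatLineReader.Read(new StringReader(text))` yields records whose `LineNumber` points back into the original input, so a failed or suspicious record can be reported with the row it came from.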
