Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom Column Renaming #50

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Custom Column Renaming #50

wants to merge 2 commits into from

Conversation

Rhymond
Copy link

@Rhymond Rhymond commented Oct 25, 2022

  • This PR fixes the bug when column name has multiple same word initialism with different cases (e.g. Id/id jSon/json)
  • Adds ability to specify custom column rename rules using command line flag -ccr (custom column rename). It ignores any other formatting rules. Very useful when you have column for example: providerId, the usual process would rename it to ProvIDerId, with custom rule defined (providerId:ProviderID) you can rename to expected ProviderID.

@Rhymond Rhymond changed the title Column renaming Custom Column Renaming Oct 25, 2022
@fraenky8
Copy link
Owner

Thanks for opening this PR, will take a look at it upcoming weekend and/or week!

Copy link
Owner

@fraenky8 fraenky8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sry, finally found some time and headspace! Left some notes 🙇‍♂️

Comment on lines +277 to +278
m := regexp.MustCompile(`(?i)` + substr)
s = string(m.ReplaceAll([]byte(s), []byte(substr)))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually seeing a regex in a loop is a huge performance penalty - but since this is not a real-time API and "just" a CLI tool, lets leave it for now, we can improve anytime if we encounter real performance issues.

One change request tho, lets use s = m.ReplaceAllString(s, substr) - this way we can get rid of all the back and forth casting between bytes and strings.

desc: "no rules",
input: "",
expected: nil,
},
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets add a test case without separator (invalid rule:value pair).

Comment on lines +320 to +321
const multiSplitter = ","
const ruleSplitter = ":"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like but can also hard-code where it's used. If you decide to keep them here, what do you think about to rename and grouping these like

const (
	rulesSeparator = ","
	ruleSeparator  = ":"
)


// ParseCustomRenameRules reads a custom rename setting string
// and returns a map for easier rule access.
// customRenameRule example: column1:Column2,col10:column10.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets be super explicit about this and mention the format first and then an example?

Format: from:to
Example: providerId:providerID
Multiple rules can be specified by comma: column1:Column2,col10:column10.

Comment on lines +332 to +333
m := make(map[string]string, strings.Count(settings.CustomColumnRename, multiSplitter)+1)
for _, rule := range strings.Split(settings.CustomColumnRename, multiSplitter) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still thinking about runtime performance here: lets split once and use the length of the rules:

rules := strings.Split(settings.CustomColumnRename, rulesSeparator)
m := make(map[string]string, len(rules))
for ...

@@ -49,6 +49,7 @@ func NewCmdArgs() (args *CmdArgs) {
flag.Var(&args.Null, "null", "representation of NULL columns: sql.Null* (sql) or primitive pointers (native|primitive)")

flag.BoolVar(&args.NoInitialism, "no-initialism", args.NoInitialism, "disable the conversion to upper-case words in column names")
flag.StringVar(&args.CustomColumnRename, "ccr", args.CustomColumnRename, "add custom colum rename rules and ignore column renaming functions, skipping initialism and casing (format: providerId:ProviderID,123:abc)")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thought as in the comment of the godoc of the parsing function, lets be super explicit here, since this is the helper/usage output for users of this tool. Maybe something along the lines

use custom-column-rules to overwrite column naming and initialism. Format ...

Can you double check the typo colum -> column?

@@ -308,19 +300,37 @@ func replaceSpace(r rune) rune {
return r
}

// renameColumn changes the column name if the change rule
// in settings applied, otherwise returns original string and false.
func renameColumn(cfg *settings.Settings, s string) (string, bool) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming of arg: s -> column

// renameColumn changes the column name if the change rule
// in settings applied, otherwise returns original string and false.
func renameColumn(cfg *settings.Settings, s string) (string, bool) {
m := cfg.ParseCustomRenameRules()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming: m -> rules

@fraenky8
Copy link
Owner

Hey @Rhymond - is this feature still wanted? Do you think you can find some time in the next couple of weeks to address my comments? Thanks a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants