Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom Column Renaming #50

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -191,6 +191,8 @@ Print usage with `-?` or `-help`
```
Usage of tables-to-go:
-? shows help and usage
-ccr string
add custom colum rename rules and ignore column renaming functions, skipping initialism and casing (format: providerId:ProviderID,123:abc)
-d string
database name (default "postgres")
-f force; skip tables that encounter errors
Expand Down
50 changes: 30 additions & 20 deletions internal/cli/tables-to-go-cli.go
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@ package cli

import (
"fmt"
"regexp"
"strings"
"unicode"

Expand Down Expand Up @@ -273,21 +274,12 @@ func getNullType(settings *settings.Settings, primitive string, sql string) stri

func toInitialisms(s string) string {
for _, substr := range initialisms {
idx := indexCaseInsensitive(s, substr)
if idx == -1 {
continue
}
toReplace := s[idx : idx+len(substr)]
s = strings.ReplaceAll(s, toReplace, substr)
m := regexp.MustCompile(`(?i)` + substr)
s = string(m.ReplaceAll([]byte(s), []byte(substr)))
Comment on lines +277 to +278
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually seeing a regex in a loop is a huge performance penalty - but since this is not a real-time API and "just" a CLI tool, lets leave it for now, we can improve anytime if we encounter real performance issues.

One change request tho, lets use s = m.ReplaceAllString(s, substr) - this way we can get rid of all the back and forth casting between bytes and strings.

}
return s
}

func indexCaseInsensitive(s, substr string) int {
s, substr = strings.ToLower(s), strings.ToLower(substr)
return strings.Index(s, substr)
}

// ValidVariableName checks for the existence of any characters
// outside of Unicode letters, numbers and underscore.
func validVariableName(s string) bool {
Expand All @@ -308,19 +300,37 @@ func replaceSpace(r rune) rune {
return r
}

// renameColumn changes the column name if the change rule
// in settings applied, otherwise returns original string and false.
func renameColumn(cfg *settings.Settings, s string) (string, bool) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming of arg: s -> column

m := cfg.ParseCustomRenameRules()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming: m -> rules

r, ok := m[s]
if !ok {
return s, false
}

return r, true
}

// FormatColumnName checks for invalid characters and transforms a column name
// according to the provided settings.
func formatColumnName(settings *settings.Settings, column, table string) (string, error) {

// Replace any whitespace with underscores
columnName := strings.Map(replaceSpace, column)
columnName = caser.String(columnName)

if settings.IsOutputFormatCamelCase() {
columnName = camelCaseString(columnName)
var renamed bool
var columnName string
if settings.HasCustomRename() {
columnName, renamed = renameColumn(settings, column)
}
if settings.ShouldInitialism() {
columnName = toInitialisms(columnName)

if !renamed {
// Replace any whitespace with underscores
columnName = strings.Map(replaceSpace, column)
columnName = caser.String(columnName)
if settings.IsOutputFormatCamelCase() {
columnName = camelCaseString(columnName)
}
if settings.ShouldInitialism() {
columnName = toInitialisms(columnName)
}
}

// Check that the column name doesn't contain any invalid characters for Go variables
Expand Down
47 changes: 47 additions & 0 deletions internal/cli/tables-to-go-cli_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -89,6 +89,42 @@ func TestCamelCaseString(t *testing.T) {
}
}

func TestCustomRenameRuleParser(t *testing.T) {
tests := []struct {
desc string
input string
expected map[string]string
}{
{
desc: "no rules",
input: "",
expected: nil,
},
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets add a test case without separator (invalid rule:value pair).

{
desc: "single rule",
input: "rule1:Rule1",
expected: map[string]string{
"rule1": "Rule1",
},
},
{
desc: "multiple rules",
input: "rule1:Rule1,hello:world",
expected: map[string]string{
"rule1": "Rule1",
"hello": "world",
},
},
}
for _, tt := range tests {
t.Run(tt.desc, func(t *testing.T) {
s := settings.New()
s.CustomColumnRename = tt.input
actual := s.ParseCustomRenameRules()
assert.Equal(t, tt.expected, actual, "test case input: "+tt.input)
})
}
}
func TestToInitialisms(t *testing.T) {
tests := []struct {
desc string
Expand Down Expand Up @@ -135,6 +171,11 @@ func TestToInitialisms(t *testing.T) {
input: "IdjsonuRlHtTp",
expected: "IDJSONURLHTTP",
},
{
desc: "different cases of initialism (id/Id or json/jSon)",
input: "providerId",
expected: "provIDerID",
},
}
for _, tt := range tests {
t.Run(tt.desc, func(t *testing.T) {
Expand Down Expand Up @@ -1991,10 +2032,15 @@ func TestFormatColumnName(t *testing.T) {
{"numbersOnly", "123", "X_123", "X123"},
{"nonEnglish", "火", "火", "火"},
{"nonEnglishUpper", "Λλ", "Λλ", "Λλ"},
{"customRenameRule", "12345", "abc", "abc"},
{"customRenameRuleWithInitialism", "providerId", "ProviderID", "ProviderID"},
}

const customRenameRules = "12345:abc,providerId:ProviderID"

camelSettings := settings.New()
camelSettings.OutputFormat = settings.OutputFormatCamelCase
camelSettings.CustomColumnRename = customRenameRules
t.Run("camelcase", func(t *testing.T) {
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
Expand All @@ -2010,6 +2056,7 @@ func TestFormatColumnName(t *testing.T) {

originalSettings := settings.New()
originalSettings.OutputFormat = settings.OutputFormatOriginal
originalSettings.CustomColumnRename = customRenameRules
t.Run("original", func(t *testing.T) {
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
Expand Down
38 changes: 36 additions & 2 deletions pkg/settings/settings.go
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@ import (
"fmt"
"os"
"path/filepath"
"strings"
)

// DBType represents a type of a database.
Expand Down Expand Up @@ -176,7 +177,8 @@ type Settings struct {
Suffix string
Null NullType

NoInitialism bool
NoInitialism bool
CustomColumnRename string

TagsNoDb bool

Expand Down Expand Up @@ -217,7 +219,8 @@ func New() *Settings {
Suffix: "",
Null: NullTypeSQL,

NoInitialism: false,
NoInitialism: false,
CustomColumnRename: "",

TagsNoDb: false,

Expand Down Expand Up @@ -308,6 +311,37 @@ func (settings *Settings) ShouldInitialism() bool {
return !settings.NoInitialism
}

// HasCustomRename returns if the setting to rename columns with
// custom rename rules is set.
func (settings *Settings) HasCustomRename() bool {
return settings.CustomColumnRename != ""
}

const multiSplitter = ","
const ruleSplitter = ":"
Comment on lines +320 to +321
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like but can also hard-code where it's used. If you decide to keep them here, what do you think about to rename and grouping these like

const (
	rulesSeparator = ","
	ruleSeparator  = ":"
)


// ParseCustomRenameRules reads a custom rename setting string
// and returns a map for easier rule access.
// customRenameRule example: column1:Column2,col10:column10.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets be super explicit about this and mention the format first and then an example?

Format: from:to
Example: providerId:providerID
Multiple rules can be specified by comma: column1:Column2,col10:column10.

// multiple rules are split with comma
func (settings *Settings) ParseCustomRenameRules() map[string]string {
if !settings.HasCustomRename() {
return nil
}

m := make(map[string]string, strings.Count(settings.CustomColumnRename, multiSplitter)+1)
for _, rule := range strings.Split(settings.CustomColumnRename, multiSplitter) {
Comment on lines +332 to +333
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still thinking about runtime performance here: lets split once and use the length of the rules:

rules := strings.Split(settings.CustomColumnRename, rulesSeparator)
m := make(map[string]string, len(rules))
for ...

r := strings.Split(rule, ruleSplitter)
if len(r) != 2 {
continue
}

m[r[0]] = r[1]
}

return m
}

// IsOutputFormatCamelCase returns if the type given by command line args is of
// camel-case format.
func (settings *Settings) IsOutputFormatCamelCase() bool {
Expand Down
1 change: 1 addition & 0 deletions tables-to-go.go
Original file line number Diff line number Diff line change
Expand Up @@ -49,6 +49,7 @@ func NewCmdArgs() (args *CmdArgs) {
flag.Var(&args.Null, "null", "representation of NULL columns: sql.Null* (sql) or primitive pointers (native|primitive)")

flag.BoolVar(&args.NoInitialism, "no-initialism", args.NoInitialism, "disable the conversion to upper-case words in column names")
flag.StringVar(&args.CustomColumnRename, "ccr", args.CustomColumnRename, "add custom colum rename rules and ignore column renaming functions, skipping initialism and casing (format: providerId:ProviderID,123:abc)")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thought as in the comment of the godoc of the parsing function, lets be super explicit here, since this is the helper/usage output for users of this tool. Maybe something along the lines

use custom-column-rules to overwrite column naming and initialism. Format ...

Can you double check the typo colum -> column?


flag.BoolVar(&args.TagsNoDb, "tags-no-db", args.TagsNoDb, "do not create db-tags")

Expand Down