Trim does not trim trailing whitespace #133

Closed
nbauernfeind opened this issue Aug 8, 2023 · 2 comments
Labels
bug Something isn't working

Comments

@nbauernfeind
Member

Description

Trailing whitespace remains in cell values even when trim is set to true.

Steps to reproduce

Load the attached CSV file (test.csv) into a deephaven-core REPL:

from deephaven import read_csv
t = read_csv("test.csv", trim=True, delimiter=",")

Expected results

I'd expect Copy Cell Unformatted (in the right-click menu on a table cell in the deephaven-core web UI) to show that every cell has been trimmed.

Actual results

The 2nd and 3rd rows for all three columns have trailing whitespace.

test.csv
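
To confirm that the trailing whitespace is really present in the input file rather than introduced by the UI, a small check like the following could be run over each line of the CSV. This is only a sketch: `TrailingWhitespaceCheck` and `columnsWithTrailingWhitespace` are hypothetical names, the split is naive (it does not understand quoted cells), and only spaces and tabs are treated as whitespace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical diagnostic helper; not part of deephaven-csv. */
final class TrailingWhitespaceCheck {
    private TrailingWhitespaceCheck() {}

    /**
     * Returns the zero-based indexes of the cells in {@code line} that end
     * with a space or a tab. Assumes no quoted cells (naive split).
     */
    static List<Integer> columnsWithTrailingWhitespace(final String line, final char delimiter) {
        final List<Integer> result = new ArrayList<>();
        // A limit of -1 keeps trailing empty cells instead of dropping them.
        final String[] cells = line.split(Pattern.quote(String.valueOf(delimiter)), -1);
        for (int i = 0; i < cells.length; i++) {
            final String cell = cells[i];
            if (cell.isEmpty()) {
                continue;
            }
            final char last = cell.charAt(cell.length() - 1);
            if (last == ' ' || last == '\t') {
                result.add(i);
            }
        }
        return result;
    }
}
```

Calling `columnsWithTrailingWhitespace(line, ',')` on each raw line of the file lists the columns that should come out trimmed when trim is enabled.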

nbauernfeind added the bug label Aug 8, 2023
@nbauernfeind
Member Author

nbauernfeind commented Aug 8, 2023

@kosak it looks like we need to add trim support to processUnquotedMode.

    public boolean grabNext(final ByteSlice dest, final MutableBoolean lastInRow)
            throws CsvReaderException {
        spillBuffer.clear();
        startOffset = offset;

        if (ignoreSurroundingSpaces) {
            skipWhitespace();
        }
        if (!tryEnsureMore()) {
            return false;
        }

        // Is first char the quote char?
        if (buffer[offset] == quoteChar) {
            ++offset;
            processQuotedMode(dest, lastInRow);
            if (trim) {
                trimWhitespace(dest);
            }
        } else {
            processUnquotedMode(dest, lastInRow);
        }
        return true;
    }

It seems that this skips the leading whitespace:

        if (ignoreSurroundingSpaces) {
            skipWhitespace();
        }

However, processUnquotedMode(dest, lastInRow) has no code path that drops the trailing whitespace.

Perhaps the intended solution was for skipWhitespace(); to also remove trailing spaces?
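
For illustration, trailing trimming for an unquoted cell could be done by pulling the end of the cell's byte range back past any whitespace. The sketch below is not the deephaven-csv implementation and is not necessarily how the bug was eventually fixed: `TrailingTrimSketch`, `trimmedEnd`, and `trimTrailing` are hypothetical names, the cell is modelled as a plain `byte[]` range rather than a ByteSlice, and only spaces and tabs are assumed to count as whitespace.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper; not the deephaven-csv implementation. */
final class TrailingTrimSketch {
    private TrailingTrimSketch() {}

    /**
     * Returns the exclusive end index of buffer[begin, end) after dropping
     * trailing spaces and tabs. Returns begin if the range is all whitespace.
     */
    static int trimmedEnd(final byte[] buffer, final int begin, int end) {
        while (end > begin && (buffer[end - 1] == ' ' || buffer[end - 1] == '\t')) {
            --end;
        }
        return end;
    }

    /** Convenience wrapper for trying the helper on strings. */
    static String trimTrailing(final String cell) {
        final byte[] bytes = cell.getBytes(StandardCharsets.UTF_8);
        final int end = trimmedEnd(bytes, 0, bytes.length);
        return new String(bytes, 0, end, StandardCharsets.UTF_8);
    }
}
```

Leading whitespace is deliberately left alone here, since skipWhitespace() already handles it before the cell is read.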

@kosak
Contributor

kosak commented Aug 14, 2023

Hi, thanks for the bug report. The issue has been fixed in Deephaven Core 0.27.1 and Deephaven CSV 0.11.0.

@kosak kosak closed this as completed Aug 14, 2023