SNOW-1023077: JDBC driver package is too big #1622

Closed
NathanEckert opened this issue Jan 30, 2024 · 10 comments
Labels
enhancement The issue is a request for improvement or a new feature status-triage_done Initial triage done, will be further handled by the driver team

Comments

@NathanEckert

Snowflake account identifier: FZLBQHX.CW96174 (See ticket 00686664 for more data and context)

Hello,
We are running into problems with the size of the jars, even when using the new experimental thin jar (#1554 (comment)).
We publish our application to PyPI, where there is a default size limit of 60 MB to avoid any abuse.

We are having some issues with the Snowflake jar sizes:

  • 3.13.30 (last one we were able to package):
    • 45 MB total (including our Snowflake specific code)
    • 31 MB from snowflake-jdbc-3.13.30.jar alone
  • 3.14.5:
    • 78 MB total (including our Snowflake specific code)
    • 67 MB from snowflake-jdbc-3.14.5 alone
  • 3.14.5-thin:
    • 73 MB total (including our Snowflake specific code)
    • 5 MB from snowflake-jdbc-3.14.5-thin (this does not include non-shaded dependencies)
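
As a rough way to check totals like the ones above against the 60 MB limit, a small shell helper could compare an artifact's size with a threshold. This is only a sketch: `check_size` is a hypothetical name, not part of any PyPI or Snowflake tooling, and it assumes 1 MB = 1024 * 1024 bytes, which may not match how a given registry counts.

```shell
# Hypothetical helper, not part of any PyPI or Snowflake tooling.
# Usage: check_size FILE LIMIT_MB
# Assumption: 1 MB = 1024 * 1024 bytes.
check_size() {
  file=$1
  limit_bytes=$(( $2 * 1024 * 1024 ))
  # Arithmetic expansion strips the padding some `wc` builds add.
  actual_bytes=$(( $(wc -c < "$file") ))
  if [ "$actual_bytes" -gt "$limit_bytes" ]; then
    echo "over"
    return 1
  fi
  echo "ok"
}
```

For example, `check_size target/fat-standalone.jar 60` would print `over` and return a non-zero status if the jar is larger than the limit.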

The size reduction of the thin jar is only apparent: once its non-shaded dependencies are included, the driver is still 65 MB. You can see it by creating a dummy project with the following pom.xml:

cat pom.xml
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>not-used</groupId>
  <artifactId>fat</artifactId>
  <version>standalone</version>

  <dependencies>
    <dependency>
      <groupId>net.snowflake</groupId>
      <artifactId>snowflake-jdbc-thin</artifactId>
      <version>LATEST</version>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <artifactId>maven-shade-plugin</artifactId>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>

and then run mvn package and du -h target/fat-standalone.jar
Running the same with version 3.13.30 yields 32 MB only!
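
To see which packages account for most of the shaded jar, one option is to group the `unzip -l` listing by leading path segments. The following is a sketch under a few assumptions: `size_by_package` is a hypothetical helper name, the input is expected in the usual Info-ZIP `unzip -l` layout (length, date, time, name), and the sizes summed are uncompressed entry sizes, so they are only useful for comparing packages against each other, not for predicting the jar's size on disk.

```shell
# Hypothetical helper: sum uncompressed entry sizes per package prefix.
# Reads `unzip -l` output on stdin; $1 is how many leading path segments
# to group by (default 2, e.g. com/google).
size_by_package() {
  awk -v depth="${1:-2}" '
    # Entry lines look like: <length> <date> <time> <name>
    # Header, separator, and total lines do not match this shape.
    $1 ~ /^[0-9]+$/ && NF >= 4 && $4 ~ /\// {
      n = split($4, parts, "/")
      key = parts[1]
      for (i = 2; i <= depth && i < n; i++) key = key "/" parts[i]
      total[key] += $1
    }
    END { for (k in total) printf "%d %s\n", total[k], k }
  ' | sort -rn
}
```

A possible invocation is `unzip -l target/fat-standalone.jar | size_by_package 2 | head`, run once per driver version to compare the largest groups.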

Side note: we encounter the issue on PyPI, but there are size constraints in other places, such as (based on the issues filed on the Snowflake repository):

This issue prevents us from picking up the latest version for bug fixes and new features.

@github-actions github-actions bot changed the title JDBC driver package is too big SNOW-1023077: JDBC driver package is too big Jan 30, 2024
@sfc-gh-dszmolka sfc-gh-dszmolka added status-triage Issue is under initial triage and removed bug labels Jan 31, 2024
@sfc-gh-dszmolka sfc-gh-dszmolka self-assigned this Jan 31, 2024
@sfc-gh-dszmolka
Contributor

hi and thank you for raising this issue - also for sharing the support case ID. we're taking a look.

@sfc-gh-dszmolka
Contributor

the official support case has been closed, but there's some information which we would like to share here in case it helps someone else too.

The thin jar for the driver is still in the experimental phase, and will stay there for a while. At the same time, we thank you and everyone who adopted it so early and provided their feedback! Based on that feedback and on existing plans, the development and optimization of the thin jar is still ongoing.

Something which we already confirmed is not applicable to your case, but might be applicable to someone else while the development efforts are ongoing: you might be able to achieve a smaller artifact size by excluding unused dependencies.
Example with the thin jar, excluding the GCS and Azure dependencies:

<dependency>
  <groupId>net.snowflake</groupId>
  <artifactId>snowflake-jdbc-thin</artifactId>
  <version>3.14.5</version>
  <exclusions>
    <exclusion>
      <groupId>com.google.api</groupId>
      <artifactId>gax</artifactId>
    </exclusion>
    <exclusion>
      <groupId>com.google.cloud</groupId>
      <artifactId>google-cloud-core</artifactId>
    </exclusion>
    <exclusion>
      <groupId>com.google.cloud</groupId>
      <artifactId>google-cloud-storage</artifactId>
    </exclusion>
    <exclusion>
      <groupId>com.microsoft.azure</groupId>
      <artifactId>azure-storage</artifactId>
    </exclusion>
    <!--    <exclusion>-->
    <!--      <groupId>com.amazonaws</groupId>-->
    <!--      <artifactId>*</artifactId>-->
    <!--    </exclusion>-->
  </exclusions>
</dependency>
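
One way to confirm that exclusions like these took effect is to count the entries left under the excluded packages in the built jar. The helper below is only a sketch: `leftover_count` is a hypothetical name, and the path prefixes in the usage note assume the excluded libraries keep their classes under directories that mirror their group IDs and are not relocated by the shade plugin, which is not guaranteed.

```shell
# Hypothetical helper: count jar entries containing a given path prefix.
# Reads a plain entry listing (one path per line, e.g. from `jar tf`) on stdin.
leftover_count() {
  # -F treats the prefix as a fixed string; grep exits non-zero when the
  # count is 0, so `|| true` keeps the helper's own status at success.
  grep -cF -- "$1" || true
}
```

For example, `jar tf target/fat-standalone.jar | leftover_count 'com/google/cloud/'` should print `0` if no classes under that path remain in the jar.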

We'll keep this thread posted and you can also watch the official release notes for the JDBC driver for updates.

@sfc-gh-dszmolka sfc-gh-dszmolka added status-triage_done Initial triage done, will be further handled by the driver team enhancement The issue is a request for improvement or a new feature and removed status-triage Issue is under initial triage labels Feb 13, 2024
@leo7

leo7 commented Jun 24, 2024

Thanks @sfc-gh-dszmolka, do we have any update on the driver size? We recently ran into an issue caused by the driver size as well. If not, we plan to use an older version to reduce the package size.

@sfc-gh-dszmolka
Contributor

hi - no specific update besides the above, which can be used to potentially reduce the size of the artifact when using the thin jar.
besides that, perhaps #1710 is also worth following

@leo7

leo7 commented Jun 26, 2024

Thanks for your comment @sfc-gh-dszmolka, really appreciated! We tried the thin jar and will work around the issue by excluding certain dependencies from the jar file.

@NathanEckert
Author

On a side note, why did the size of the JAR more than double between 3.13 and 3.14?

@jossmoff

jossmoff commented Nov 19, 2024

Hey team 👋🏻 I've recently had an issue with updating to the latest version of jdbc-driver (v3.2.0) because of how large the jar is, and it exceeding the lambda package deployment size. I want to use the thin jar, but according to the release notes it is experimental. Is there a reason it's still experimental?

@sfc-gh-dszmolka
Contributor

hey - we have plans to make it non-experimental by the end of this year - January 2025. Until then, you're more than free to test the thin jar in your project; and see how it works out for you.

@jossmoff

jossmoff commented Nov 19, 2024

hey - we have plans to make it non-experimental by the end of this year - January 2025. Until then, you're more than free to test the thin jar in your project; and see how it works out for you.

Thanks for the prompt reply @sfc-gh-dszmolka, will give it a go 👍🏻

@sfc-gh-dszmolka
Contributor

the improvement was implemented a couple of months ago with snowflake-jdbc-thin, which will also be GA, according to plans, by January 2025 at the latest
