Improving Self-Hosting and Removing 3rd Party dependencies. #4465
Open
Podginator wants to merge 27 commits into omnivore-app:main from Podginator:self-host-updates
+11,779 −5,961
Changes shown are from 18 of the 27 commits.
All 27 commits are by Podginator:

60303ff Self-Hosting Changes
bb7b3c9 Fix Minio Environment Variable
593bac0 Just make pdfs successful, due to lack of PDFHandler
d4710a8 Fix issue where flag was set wrong
26c5ef3 Added an NGINX Example file
4607032 Add some documentation for self-hosting via Docker Compose
ae66e2e Make some adjustments to Puppeteer due to failing sites.
b350fbd adjust timings
322ec68 Add start of Mail Service
6f1ee6b Fix Docker Files
222ba06 More email service stuff
34e039e Add Guide to use Zapier for Email-Importing.
8b845b5 Ensure that if no env is provided it uses the old email settings
e557fd0 Add some instructions for self-hosted email
b8226db Add SNS Endpoints for Mail Watcher
af70b25 Add steps and functionality for using SES and SNS for email
2e3134c Uncomment a few jobs.
ab51fc9 Added option for Firefox for parser. Was having issues with Chromium …
0e6c675 Add missing space.
6b7f170 Fix some wording on the Guide
9d41cc5 Fix Package
a66f92b Fix MV
c27af01 Do raw handlers for Medium
7bebb45 Fix images in Medium
7bdf222 Update self-hosting/GUIDE.md
d42656b Update Guide with other variables
685f542 Merge
New file (+61 lines):

```dockerfile
FROM node:18.16 as builder

WORKDIR /app

ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD true
RUN apt-get update && apt-get install -y g++ make python3

COPY package.json .
COPY yarn.lock .
COPY tsconfig.json .
COPY .prettierrc .
COPY .eslintrc .

COPY /packages/readabilityjs/package.json ./packages/readabilityjs/package.json
COPY /packages/api/package.json ./packages/api/package.json
COPY /packages/text-to-speech/package.json ./packages/text-to-speech/package.json
COPY /packages/content-handler/package.json ./packages/content-handler/package.json
COPY /packages/liqe/package.json ./packages/liqe/package.json
COPY /packages/utils/package.json ./packages/utils/package.json

RUN yarn install --pure-lockfile

ADD /packages/readabilityjs ./packages/readabilityjs
ADD /packages/api ./packages/api
ADD /packages/text-to-speech ./packages/text-to-speech
ADD /packages/content-handler ./packages/content-handler
ADD /packages/liqe ./packages/liqe
ADD /packages/utils ./packages/utils

RUN yarn workspace @omnivore/utils build
RUN yarn workspace @omnivore/text-to-speech-handler build
RUN yarn workspace @omnivore/content-handler build
RUN yarn workspace @omnivore/liqe build
RUN yarn workspace @omnivore/api build

# After building, fetch the production dependencies
RUN rm -rf /app/packages/api/node_modules
RUN rm -rf /app/node_modules
RUN yarn install --pure-lockfile --production

FROM node:18.16 as runner
LABEL org.opencontainers.image.source="https://github.com/omnivore-app/omnivore"

RUN apt-get update && apt-get install -y netcat-openbsd

WORKDIR /app

ENV NODE_ENV production

COPY --from=builder /app/packages/api/dist /app/packages/api/dist
COPY --from=builder /app/packages/readabilityjs/ /app/packages/readabilityjs/
COPY --from=builder /app/packages/api/package.json /app/packages/api/package.json
COPY --from=builder /app/packages/api/node_modules /app/packages/api/node_modules
COPY --from=builder /app/node_modules /app/node_modules
COPY --from=builder /app/package.json /app/package.json
COPY --from=builder /app/packages/text-to-speech/ /app/packages/text-to-speech/
COPY --from=builder /app/packages/content-handler/ /app/packages/content-handler/
COPY --from=builder /app/packages/liqe/ /app/packages/liqe/
COPY --from=builder /app/packages/utils/ /app/packages/utils/

CMD ["yarn", "workspace", "@omnivore/api", "start_queue_processor"]
```
New file (+71 lines):

```typescript
import { SignedUrlParameters, StorageClient, File } from './StorageClient'
import { Storage, File as GCSFile } from '@google-cloud/storage'

export class GcsStorageClient implements StorageClient {
  private storage: Storage

  constructor(keyFilename: string | undefined) {
    this.storage = new Storage({
      keyFilename,
    })
  }

  private convertFileToGeneric(gcsFile: GCSFile): File {
    return {
      isPublic: async () => {
        const [isPublic] = await gcsFile.isPublic()
        return isPublic
      },
      exists: async () => (await gcsFile.exists())[0],
      download: async () => (await gcsFile.download())[0],
      bucket: gcsFile.bucket.name,
      publicUrl: () => gcsFile.publicUrl(),
      getMetadataMd5: async () => {
        const [metadata] = await gcsFile.getMetadata()
        return metadata.md5Hash
      },
    }
  }

  downloadFile(bucket: string, filePath: string): Promise<File> {
    const file = this.storage.bucket(bucket).file(filePath)
    return Promise.resolve(this.convertFileToGeneric(file))
  }

  async getFilesFromPrefix(bucket: string, prefix: string): Promise<File[]> {
    const [filesWithPrefix] = await this.storage
      .bucket(bucket)
      .getFiles({ prefix })

    return filesWithPrefix.map((it: GCSFile) => this.convertFileToGeneric(it))
  }

  async signedUrl(
    bucket: string,
    filePath: string,
    options: SignedUrlParameters
  ): Promise<string> {
    const [url] = await this.storage
      .bucket(bucket)
      .file(filePath)
      .getSignedUrl({ ...options, version: 'v4' })

    return url
  }

  upload(
    bucket: string,
    filePath: string,
    data: Buffer,
    options: {
      contentType?: string
      public?: boolean
      timeout?: number
    }
  ): Promise<void> {
    return this.storage
      .bucket(bucket)
      .file(filePath)
      .save(data, { timeout: 30000, ...options })
  }
}
```
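Both storage backends in this PR implement a common StorageClient contract, so the rest of the API never needs to know which one is in use. As a rough illustration of that design, here is a minimal in-memory client with the same shape. Note this is a hypothetical sketch: the interface below is inferred from the two classes in this PR, and may not match the real StorageClient.ts exactly.

```typescript
// Hypothetical sketch: interface shape inferred from the GCS/S3 clients
// in this PR, not copied from the real StorageClient.ts.
interface GenericFile {
  exists: () => Promise<boolean>
  isPublic: () => Promise<boolean>
  download: () => Promise<Buffer>
  getMetadataMd5: () => Promise<string | undefined>
  bucket: string
  publicUrl: () => string
}

class InMemoryStorageClient {
  private store = new Map<string, Buffer>()

  // Store the bytes under "bucket/path", mirroring upload(bucket, filePath, data)
  async upload(bucket: string, filePath: string, data: Buffer): Promise<void> {
    this.store.set(`${bucket}/${filePath}`, data)
  }

  // Return a lazy file handle, like the GCS/S3 clients do
  async downloadFile(bucket: string, filePath: string): Promise<GenericFile> {
    const key = `${bucket}/${filePath}`
    return {
      exists: async () => this.store.has(key),
      isPublic: async () => true,
      download: async () => this.store.get(key) ?? Buffer.alloc(0),
      getMetadataMd5: async () => undefined,
      bucket,
      publicUrl: () => `memory://${key}`,
    }
  }
}

async function demo() {
  const client = new InMemoryStorageClient()
  await client.upload('pages', 'a/article.html', Buffer.from('<html/>'))
  const file = await client.downloadFile('pages', 'a/article.html')
  console.log(await file.exists(), (await file.download()).toString())
}
demo()
```

A caller written against this shape works unchanged whether the backing store is GCS, MinIO, or memory, which is what makes the MinIO swap below possible.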
New file (+140 lines):

```typescript
import { SignedUrlParameters, StorageClient, File } from './StorageClient'
import {
  GetObjectCommand,
  GetObjectCommandOutput,
  S3Client,
  ListObjectsV2Command,
  PutObjectCommand,
} from '@aws-sdk/client-s3'
import { getSignedUrl } from '@aws-sdk/s3-request-presigner'
import type { Readable } from 'stream'

// While this is listed as S3, for self hosting we will use MinIO, which is
// S3 compatible.
export class S3StorageClient implements StorageClient {
  private s3Client: S3Client
  private urlOverride: string | undefined

  constructor(urlOverride: string | undefined) {
    this.urlOverride = urlOverride
    this.s3Client = new S3Client({
      forcePathStyle: true,
      endpoint: urlOverride,
    })
  }

  private convertFileToGeneric(
    s3File: GetObjectCommandOutput
  ): Omit<File, 'bucket' | 'publicUrl'> {
    return {
      exists: () => {
        return Promise.resolve(s3File.$metadata.httpStatusCode == 200)
      },
      isPublic: async () => Promise.resolve(true),
      download: async () => this.getFileFromReadable(s3File.Body as Readable),
      getMetadataMd5: () => Promise.resolve(s3File.ETag),
    }
  }

  private getFileFromReadable(stream: Readable): Promise<Buffer> {
    return new Promise<Buffer>((resolve, reject) => {
      const chunks: Buffer[] = []
      stream.on('data', (chunk) => chunks.push(chunk))
      stream.once('end', () => resolve(Buffer.concat(chunks)))
      stream.once('error', reject)
    })
  }

  async downloadFile(bucket: string, filePath: string): Promise<File> {
    const s3File = await this.s3Client.send(
      new GetObjectCommand({
        Bucket: bucket,
        Key: filePath, // path to the file you want to download
      })
    )

    return {
      ...this.convertFileToGeneric(s3File),
      bucket: bucket,
      publicUrl: () => `${this.urlOverride ?? ''}/${bucket}/${filePath}`,
    }
  }

  async getFilesFromPrefix(bucket: string, prefix: string): Promise<File[]> {
    const s3PrefixedFiles = await this.s3Client.send(
      new ListObjectsV2Command({
        Bucket: bucket,
        Prefix: prefix, // list objects under this prefix
      })
    )

    const prefixKeys = s3PrefixedFiles.CommonPrefixes || []

    return prefixKeys
      .map(({ Prefix }) => Prefix)
      .map((key) => {
        return {
          exists: () => Promise.resolve(true),
          isPublic: async () => Promise.resolve(true),
          download: async () => {
            const s3File = await this.s3Client.send(
              new GetObjectCommand({
                Bucket: bucket,
                Key: key,
              })
            )

            return this.getFileFromReadable(s3File.Body as Readable)
          },
          getMetadataMd5: () => Promise.resolve(key),
          bucket: bucket,
          publicUrl: () => `${this.urlOverride ?? ''}/${bucket}/${key}`,
        }
      })
  }

  async signedUrl(
    bucket: string,
    filePath: string,
    options: SignedUrlParameters
  ): Promise<string> {
    const command =
      options.action == 'read'
        ? new GetObjectCommand({
            Bucket: bucket,
            Key: filePath,
          })
        : new PutObjectCommand({
            Bucket: bucket,
            Key: filePath,
          })

    // eslint-disable-next-line @typescript-eslint/no-unsafe-call
    const url = await getSignedUrl(this.s3Client, command, {
      expiresIn: 900,
    })

    return url
  }

  upload(
    bucket: string,
    filePath: string,
    data: Buffer,
    options: {
      contentType?: string
      public?: boolean
      timeout?: number
    }
  ): Promise<void> {
    return this.s3Client
      .send(
        new PutObjectCommand({
          Bucket: bucket,
          Key: filePath,
          Body: data,
        })
      )
      .then(() => {})
  }
}
```
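The S3 client's getFileFromReadable helper exists because the AWS SDK v3 returns GetObject bodies as a stream rather than a buffer. The pattern is plain Node and worth seeing in isolation; the sketch below reproduces it standalone (streamToBuffer is a name chosen here, not from the PR):

```typescript
import { Readable } from 'stream'

// Collect a Node Readable stream into a single Buffer -- the same pattern
// S3StorageClient.getFileFromReadable uses for S3 GetObject bodies.
function streamToBuffer(stream: Readable): Promise<Buffer> {
  return new Promise<Buffer>((resolve, reject) => {
    const chunks: Buffer[] = []
    stream.on('data', (chunk) => chunks.push(Buffer.from(chunk)))
    stream.once('end', () => resolve(Buffer.concat(chunks)))
    stream.once('error', reject)
  })
}

async function demo() {
  // Readable.from turns an iterable of chunks into a stream for testing
  const body = Readable.from([Buffer.from('hello '), Buffer.from('world')])
  const buf = await streamToBuffer(body)
  console.log(buf.toString()) // "hello world"
}
demo()
```

Registering the `error` handler matters: without it, a failed S3 read would surface as an unhandled stream error instead of a rejected promise.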
Can we use Cloudflare R2 as well for self hosting? What was the decision behind using MinIO?
Asking because R2 is also S3 compatible.
I'm not actually familiar with R2, but anything that is S3-compatible should work. Let me take a look later to see whether the storage client I built works with it.
MinIO was chosen because it can be self-hosted along with the rest of the application: there is a Docker image, and it can all run on the same server without relying on anything external.
I'm trying to ensure everything here can run self-contained, without any need for external services.
That said, as with some of the email changes, I am looking into ways to simplify parts of it too, and having some external services is fine with me.
To find suitable services, I recommend consulting r/self-hosted.
Love the work so far.
S3 is a nice idea; it provides various options, including self-hosted ones.
How about local storage? That would reduce the required dependencies by one.
Oh wow, I didn't know MinIO can be self-hosted! That sounds like a good idea.
The uploads are done via signed URLs, so while local storage would be feasible, it would require a bit more development work.
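For what the upload/download half of a local-storage backend might look like, here is a rough sketch (entirely hypothetical, not part of this PR; the class and method names are illustrative). The hard part the comment alludes to is not shown: signed URLs have no filesystem equivalent, so the API would have to serve these files itself and mint its own expiring tokens.

```typescript
import { promises as fs } from 'fs'
import * as path from 'path'
import * as os from 'os'

// Hypothetical filesystem-backed storage client (not part of this PR).
// Buckets become directories under a root; object keys become file paths.
class LocalStorageClient {
  constructor(private rootDir: string) {}

  private keyPath(bucket: string, filePath: string): string {
    return path.join(this.rootDir, bucket, filePath)
  }

  async upload(bucket: string, filePath: string, data: Buffer): Promise<void> {
    const target = this.keyPath(bucket, filePath)
    // Create intermediate directories, since object keys may contain slashes
    await fs.mkdir(path.dirname(target), { recursive: true })
    await fs.writeFile(target, data)
  }

  async download(bucket: string, filePath: string): Promise<Buffer> {
    return fs.readFile(this.keyPath(bucket, filePath))
  }
}

async function demo() {
  const root = await fs.mkdtemp(path.join(os.tmpdir(), 'omnivore-local-'))
  const client = new LocalStorageClient(root)
  await client.upload('pages', 'demo.txt', Buffer.from('stored locally'))
  console.log((await client.download('pages', 'demo.txt')).toString())
}
demo()
```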