New Source: Email Connector #35253
Replies: 4 comments
-
@juhaelee that sounds like a really valuable use case. I'd love to dig into some of this to understand your context:
Realistically speaking, for Airbyte to do this really well, we'd need a highly available component (like an always-on email server) to receive these emails. High availability matters here because there are very few guarantees that these emails can be resent (e.g. automated email reports from services). If an email is sent while a user is restarting or upgrading Airbyte, the data could be lost, so you can see how this is not a trivial task. Because Airbyte today is self-hosted, the responsibility of keeping those components available falls on the user.

However, I think we can get pretty close by spinning up the highly available component in the user's cloud infrastructure where possible. For example, we could provide a template to spin up an API gateway and backing storage on AWS, GCP, and Azure, to help users collect these emails in, say, an S3 bucket. Then a user could use something like the File connector (plus some needed additions like #1874 and the ability to express a dynamic schema for a sync) to copy these files.

Would something like the above be too much friction for you to consider this solution? We can potentially make it a lot easier by having Airbyte automatically spin up the infrastructure when given the right permissions. I'd have to think more about it, but I wanted to get your initial thoughts.
-
Hey Sherif, I do not use the connector today, but I work with various customers (disclosure: I work at Segment) who have described it as "Fivetran's killer feature". One customer I work with says it saves their data engineering team countless hours by allowing different teams (marketing, product, etc.) to upload data into their data warehouse themselves without needing to go through the data team. My personal thought is that having Airbyte automatically spin up the infra when given the right permissions is the path of least friction.
-
An email source connector seems like a basic feature for an extract-load tool: Apache NiFi has it, Fivetran has it, Rivery offers it... I am surprised this question did not catch more attention. There has also been another such request specifically for Gmail. For many small and even mid-sized companies, email attachments are still one of the main sources of data for their DWH. I wonder why it has not been created yet. Is there anything against such a source connector in Airbyte's policies (I did not find anything against it)? If not, I would be glad to contribute and try to create it.
-
Why not use a regular SMTP email service that supports POP3 (or IMAP) as the mail receiver? Then write a POP3 (or IMAP) client connector that translates the RFC 822 email from any typical email store and sends it to the data warehouse. This will save you a lot of trouble checking DKIM and SPF. Most internet providers (Comcast, Digital Ocean, etc.) block port 25 these days in an effort to reduce spam, so that could be a big problem for your use case if you are running on one of those networks. What you said above is totally doable, but I think you will save yourself and the users of this connector a lot of headaches (DNS MX setup, SPF, DKIM, etc.) if you just interface at the mailbox layer instead of the SMTP layer. You could also consider that companies like SparkPost (and I believe your sister company SendGrid) offer an inbound mail webhook. They parse the inbound RFC 822 content from SMTP and deliver it as a JSON webhook. That is also a way to solve this. I think Kumo MTA may also do this sort of thing, which would let you focus on the translation to the data warehouse instead of all the email bits, which are not easy. My $.02
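To make the mailbox-layer idea a bit more concrete, here is a minimal sketch of the translation step, using only Python's standard `email` package. It assumes the raw RFC 822 bytes have already been fetched from the mailbox (for example over IMAP) and that only `.csv` and `.json` attachments are of interest. The function names, the extension filter, and the sample addresses are illustrative assumptions, not part of any Airbyte connector.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Assumption: only the file types mentioned in the original request matter.
WANTED_EXTENSIONS = (".csv", ".json")


def extract_data_attachments(raw_message: bytes) -> list[tuple[str, bytes]]:
    """Return (filename, payload) pairs for CSV/JSON attachments of one message."""
    # policy.default makes the parser return an EmailMessage,
    # which provides iter_attachments().
    msg = message_from_bytes(raw_message, policy=policy.default)
    found = []
    for part in msg.iter_attachments():
        filename = part.get_filename() or ""
        if filename.lower().endswith(WANTED_EXTENSIONS):
            # decode=True undoes the base64 / quoted-printable transfer encoding.
            found.append((filename, part.get_payload(decode=True)))
    return found


def build_sample_message() -> bytes:
    """Build a small message with one CSV attachment (hypothetical test data)."""
    msg = EmailMessage()
    msg["From"] = "reports@example.com"
    msg["To"] = "ingest@example.com"
    msg["Subject"] = "Weekly report"
    msg.set_content("Report attached.")
    msg.add_attachment(
        b"id,name\n1,alice\n",
        maintype="application",
        subtype="octet-stream",
        filename="report.csv",
    )
    return msg.as_bytes()


if __name__ == "__main__":
    for name, payload in extract_data_attachments(build_sample_message()):
        print(name, len(payload), "bytes")
```

In a real connector the raw bytes would come from the mailbox rather than from `build_sample_message()`, for instance via `imaplib.IMAP4_SSL(...).fetch(num, "(RFC822)")`. Connection handling, authentication, and tracking which messages were already synced are left out of this sketch.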
-
Tell us about the problem you're trying to solve
Multiple teams may want to upload CSVs into their warehouse. Sending a CSV or JSON file via email would make the process easier.
Describe the solution you’d like
Ability to send a CSV or JSON file to an Airbyte email address. Perhaps generate a unique address (e.g. [email protected]). The attached CSV or JSON file would then be uploaded into the data warehouse.
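As a rough illustration of the last step, the sketch below turns a received CSV or JSON attachment into a list of records that could be handed to a destination. It is only a sketch under stated assumptions: the function name `attachment_to_records` is hypothetical, the payload is assumed to be UTF-8, and a JSON file is assumed to hold either a single object or an array of objects.

```python
import csv
import io
import json


def attachment_to_records(filename: str, payload: bytes) -> list[dict]:
    """Convert one CSV or JSON attachment into a list of row dictionaries."""
    text = payload.decode("utf-8")  # assumption: attachments are UTF-8 encoded
    lowered = filename.lower()
    if lowered.endswith(".csv"):
        # The first CSV line is treated as the header; all values stay strings.
        return [dict(row) for row in csv.DictReader(io.StringIO(text))]
    if lowered.endswith(".json"):
        data = json.loads(text)
        # Wrap a single JSON object so callers always receive a list.
        return data if isinstance(data, list) else [data]
    raise ValueError(f"unsupported attachment type: {filename}")


if __name__ == "__main__":
    print(attachment_to_records("report.csv", b"id,name\n1,alice\n"))
    print(attachment_to_records("report.json", b'{"id": 1, "name": "alice"}'))
```

Schema inference and type casting are deliberately omitted; CSV values stay as strings here.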
Describe the alternative you’ve considered or used
https://fivetran.com/blog/email-connector-uses
https://fivetran.com/docs/files/email
https://fivetran.com/docs/files/email/setup-guide