Accounts and Snaps

Hi,

I have an account tied to a snap pack. Basically, every snap calls the Account.connect() function to get its access token, and if the token is expired, the Account re-authorizes and gets a new one. While fighting through a bug in implementing threading for a snap pack, I started to question how accounts work in SnapLogic.

Let’s suppose I have a simple pipeline:

  1. A get data snap…followed by
  2. A post data snap

If I had to auth in the first snap to generate a token, should that second snap be able to use that token as well? Or does each snap have its own instance of an account, which both then require separate auth requests?

I think you can use static variables in the runtime to help with this. Obviously a lot depends on threading and on whether this account is shared with all the other processes running in the same runtime, so you will probably need to guard the shared state a little (for example, by synchronizing access to it). I had a similar situation with multiple pipelines running with more than one account and it worked out pretty well.

Abe

This is a surprisingly difficult issue and it’s under active research for the best approach.

The short answer is that each snap is an independent process and each account for each snap is also an independent instance. That’s not a problem - as you suggest we can use the account information to create a key into a static cache in order to retrieve a previously established connection.
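
To make the static cache idea concrete, here is a minimal sketch for the token case from the original question. Everything in it is an assumption for illustration: CachedToken, TokenCache, the accountKey string, and the authorizer callback are hypothetical names, not SnapLogic SDK classes, and it only works if all the account instances are loaded by the same classloader.

```java
import java.time.Instant;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

/** Hypothetical token holder; not part of the SnapLogic SDK. */
final class CachedToken {
    final String value;
    final Instant expiresAt;

    CachedToken(String value, Instant expiresAt) {
        this.value = value;
        this.expiresAt = expiresAt;
    }

    /** True once the current time has reached the expiry instant. */
    boolean isExpired(Instant now) {
        return !now.isBefore(expiresAt);
    }
}

/** Hypothetical static cache shared by every account instance in the same classloader. */
final class TokenCache {
    private static final ConcurrentMap<String, CachedToken> TOKENS = new ConcurrentHashMap<>();

    private TokenCache() {}

    /**
     * Returns the cached token for this account key, calling the authorizer
     * only when no token is cached or the cached one has expired.
     */
    static CachedToken get(String accountKey, Supplier<CachedToken> authorizer) {
        // compute() is atomic per key, so two snaps asking for the same
        // account at the same time do not both re-authorize.
        return TOKENS.compute(accountKey, (key, existing) ->
                (existing == null || existing.isExpired(Instant.now()))
                        ? authorizer.get()
                        : existing);
    }
}
```

With something like this, the second snap in the pipeline would call TokenCache.get() with the same account key and get back the token the first snap already obtained. The account key should be built from whatever distinguishes one account from another (endpoint, user, and so on), so that different accounts never share an entry.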

The not-so-short answer is that we recently switched to using a single classloader for all snaps and accounts in a pipeline but then encountered a situation where one pipeline had snaps with accounts that depend upon mutually incompatible driver jars. I know we were discussing reintroducing separate classloaders for accounts but I don’t know the details. E.g., are the classes marked with an @Account annotation loaded via a different classloader, or only the objects set via Guice injection? Let me know if you need to know the details.

In practice? You should be fine using a static cache (e.g., a Guava Cache object, or a WeakHashMap) as long as your cache is declared to use the most general type possible. E.g., return a DataSource instead of an OracleDataSource, much less an Oracle12xArticulatingGumboDataSource.

As an aside - unless you have a compelling reason to do otherwise your cache should contain factories instead of specific instances. JDBC DataSources instead of JDBC Connections, JMS Connections instead of JMS Sessions. It’s not any more difficult to implement and it guarantees each snap gets a fresh, thread-safe copy.
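
A rough sketch of the "cache factories, not instances" idea, under stated assumptions: Session is a hypothetical stand-in for a per-use resource such as a JDBC Connection, and FactoryCache, accountKey, and factoryBuilder are made-up names for illustration, not anything from the SnapLogic SDK. The cache is declared against the general Supplier<Session> type rather than a driver-specific class.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical stand-in for a per-use resource such as a JDBC Connection. */
interface Session {
    String describe();
}

/** Hypothetical static cache that stores factories rather than live sessions. */
final class FactoryCache {
    // Keyed by account information; values are general-purpose factories.
    private static final ConcurrentMap<String, Supplier<Session>> FACTORIES =
            new ConcurrentHashMap<>();

    private FactoryCache() {}

    /**
     * Builds the factory for this account key at most once, then asks it
     * for a session on every call, so callers never share a session object.
     */
    static Session open(String accountKey,
                        Function<String, Supplier<Session>> factoryBuilder) {
        Supplier<Session> factory = FACTORIES.computeIfAbsent(accountKey, factoryBuilder);
        return factory.get();
    }
}
```

The expensive setup (the factory) is shared across snaps, while each snap still gets its own session to use on its own thread. Whether the sessions are really independent depends on the factory handing out a new object each time, which is the factory author's responsibility.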