Getting Started

To get started you need only the following:

  1. An account with VGS, which gives you
    1. An account DNS name
    2. A proxy username/password pair
  2. An application that transmits data as XML or JSON over HTTPS
  3. Optionally, access to your DNS hosting service
  4. Optionally, an SSL certificate for your application’s domain name

Since all our services operate using forward or reverse proxies, no special client library is required. Your existing application’s platform will work with little to no modification. Forward proxy SSL depends on the endpoint’s support for SNI.

Redacting sensitive data

How do I get sensitive data into the VGS vault?

The fastest way to get data from your application into the VGS vault is to integrate our reverse proxy in your own application code. As a registered user, you get an account DNS name in the form <company>.<environment>.verygoodproxy.com. This name resolves to an endpoint that resembles the frontend in a traditional reverse proxy configuration. In the control panel you configure your own application endpoint as the backend. Once your application is configured, send your POST requests to your account DNS name instead of directly to your application endpoint. We handle the rest!

Your application is unique. We cannot determine what data is sensitive to you until you tell us. By default, no keys are redacted. You define the keys to redact with our rules engine by configuring rules in your account that match those keys. Rules are expressed as JSONPath or XPath, depending on the Content-Type you send (application/json or application/xml, respectively).
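As a rough illustration of what such a rule does, the sketch below applies a single JSONPath-style rule to a JSON body. It is not the VGS rules engine: the redact function, the made-up token format, and the restriction to top-level paths such as $.vault_token are all assumptions made for this example.

```python
import json
import secrets


def redact(body: str, rule: str) -> str:
    """Replace the value selected by a top-level rule such as '$.vault_token'.

    Hypothetical helper: only '$.<key>' paths are understood, and the token
    format is invented for illustration.
    """
    if not rule.startswith("$."):
        raise ValueError("this sketch only supports '$.<key>' rules")
    key = rule[2:]
    payload = json.loads(body)
    if key in payload:
        # Stand-in for the vault: swap the secret for an opaque token.
        payload[key] = "tok_example_" + secrets.token_hex(8)
    return json.dumps(payload)


original = '{"vault_token": "some secret data", "some_other_field": "foo bar"}'
redacted = json.loads(redact(original, "$.vault_token"))
print(redacted["some_other_field"])                        # untouched field
print(redacted["vault_token"].startswith("tok_example_"))  # value was swapped
```

Keys that no rule matches pass through unchanged, which mirrors the default of redacting nothing.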

In the following example, we define the account DNS name as demo.sandbox.verygoodproxy.com. The company is “demo” and the environment is “sandbox”. This environment allows you to develop your application with valid testing tokens before you go live and store user-generated tokens.

In this example, vault_token is configured in your account as the key to redact. The backend is set to a demonstration web application that passes the POST request to https://httpbin.org/post, a service that echoes back the data it receives. Try it by copying and pasting the following into a terminal. You need the curl application installed.

curl \
  -H 'Content-Type: application/json' \
  -d '{"vault_token":"some secret data", "some_other_field": "foo bar"}' \
  https://demo.sandbox.verygoodproxy.com/post

The response from curl should look something like

{
  "args": {},
  "data": "{\"vault_token\":\"tok_sandbox_34vnd2mQpHBjoEKH66zt59\",\"some_other_field\":\"foo bar\"}",
  "files": {},
  "form": {},
  "headers": {
    "Accept": "*/*",
    "Content-Length": "81",
    "Content-Type": "application/json",
    "Host": "httpbin.org",
    "User-Agent": "curl/7.49.1",
    "Vgs-Request-Id": "f24d38ab14fdf9bc",
    "Via": "1.1 c82848d53822"
  },
  "json": {
    "some_other_field": "foo bar",
    "vault_token": "tok_sandbox_34vnd2mQpHBjoEKH66zt59"
  },
  "origin": "219.88.166.97, 52.38.213.74",
  "url": "https://httpbin.org/post"
}

The request originated from curl (see the User-Agent header) and was sent over HTTPS, so the data in the POST body was not sent in the clear. The JSON data in the response has the same keys as the input, but the rule matching the vault_token key was invoked and its value was replaced with a unique token.
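The same round trip can be sketched in Python using only the standard library. The helper names build_request and extract_token are hypothetical, and the sketch assumes the backend echoes the body in the httpbin format shown above. The request is built but not sent, so the example also runs without network access.

```python
import json
import urllib.request

# Account DNS name from the example above.
PROXY_URL = "https://demo.sandbox.verygoodproxy.com/post"


def build_request(payload: dict) -> urllib.request.Request:
    """Build (but do not send) the same POST that the curl command issues."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        PROXY_URL,
        data=body,  # supplying data makes this a POST
        headers={"Content-Type": "application/json"},
    )


def extract_token(response_text: str, key: str = "vault_token") -> str:
    """Pull the redacted value out of an httpbin-style echo response."""
    return json.loads(response_text)["json"][key]


request = build_request(
    {"vault_token": "some secret data", "some_other_field": "foo bar"}
)
print(request.get_method(), request.full_url)

# To actually send it (requires network access to the sandbox):
# with urllib.request.urlopen(request) as response:
#     print(extract_token(response.read().decode("utf-8")))
```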

Your application can save this token in your database along with other account data. You do not have to store any sensitive data, and your application can refer to the sensitive data by its unique token. For example, you can store credit card account records for returning visitors so they don’t have to re-enter their card data. Since your application stores a token rather than the literal card data, you are out of PCI scope and can move on to developing your application features.
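A minimal sketch of that storage step, using an in-memory SQLite database. The table name, column names, and account identifier are hypothetical; the point is only that the row holds the token and never the card data.

```python
import sqlite3


def save_card_token(conn: sqlite3.Connection, account_id: str, token: str) -> None:
    """Store the token returned by the proxy; the card data itself is never stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS card_tokens ("
        "account_id TEXT PRIMARY KEY, token TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO card_tokens (account_id, token) VALUES (?, ?)",
        (account_id, token),
    )
    conn.commit()


def load_card_token(conn: sqlite3.Connection, account_id: str):
    """Return the stored token for an account, or None if there is none."""
    row = conn.execute(
        "SELECT token FROM card_tokens WHERE account_id = ?", (account_id,)
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
save_card_token(conn, "account-42", "tok_sandbox_34vnd2mQpHBjoEKH66zt59")
print(load_card_token(conn, "account-42"))
```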

Enriching redacted data

How do I get sensitive data out of the vault for my application to transmit to a financial/medical institution?

The easiest way to enrich data sent from your application to a financial or medical institution requires only a single change to your system. Set the upstream proxy for outgoing traffic to your account DNS name (in our demonstration case, demo.sandbox.verygoodproxy.com) on TCP port 8080. Authenticate to the proxy with the username and password you obtained from your account control panel. Keeping consistent with our curl example, set the following environment variable in your terminal:

export HTTPS_PROXY="https://demo-user@demo.sandbox.verygoodproxy.com:8080"

Then copy and paste the following

curl -L -k \
  -H 'Content-Type: application/json' \
  -d '{"vault_token": "tok_sandbox_34vnd2mQpHBjoEKH66zt59", "another field": "bar baz"}' \
  https://httpbin.org/post

What’s going on here? First, we export the environment variable HTTPS_PROXY, which curl reads by default to find the proxy host and port. The user name embedded in that URL (demo-user) makes curl authenticate and send the Proxy-Authorization header to the proxy; the same credentials can instead be passed with curl’s -U option. We don’t need a password in this demonstration; with -U, an empty password is written as a trailing ’:’ character after the user name. The -k option skips certificate verification, because the proxy presents its own certificate to the client. Finally, the value of vault_token is the same token returned from the first example. The response should look something like this.

{
  "args": {},
  "data": "{\"vault_token\":\"some secret data\",\"another field\":\"bar baz\"}",
  "files": {},
  "form": {},
  "headers": {
    "Accept": "*/*",
    "Content-Length": "60",
    "Content-Type": "application/json",
    "Host": "httpbin.org",
    "User-Agent": "curl/7.49.1",
    "Vgs-Request-Id": "15d52ab1af3f34cd",
    "Via": "1.1 c82848d53822"
  },
  "json": {
    "another field": "bar baz",
    "vault_token": "some secret data"
  },
  "origin": "52.38.213.74",
  "url": "https://httpbin.org/post"
}

Notice how the JSON data has been enriched with the original information? A real-world example would be using your application to send a POST request to your financial institution’s API with the de-tokenized data. Since your server is using the VGS proxy, your application never interacts directly with the sensitive data. The financial endpoint receives this enriched (now sensitive) data directly from the VGS proxy, keeping your system out of compliance scope and leaving you free to move on.
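For an application written in Python, the outgoing request could be prepared along the following lines with the standard library's http.client. This is a sketch under assumptions: open_tunnel is a hypothetical helper, the proxy is assumed to accept a plain-text CONNECT request on port 8080, and the user name is sent with an empty password as in the curl demonstration. Nothing is sent until request() is called, so the network lines are left commented out.

```python
import base64
import http.client
import json

PROXY_HOST = "demo.sandbox.verygoodproxy.com"  # account DNS name from the example
PROXY_PORT = 8080
PROXY_USER = "demo-user"  # user name from the HTTPS_PROXY setting above


def open_tunnel(target_host: str) -> http.client.HTTPSConnection:
    """Prepare a connection that reaches target_host through the forward proxy."""
    # Basic credentials are base64("user:password"); the password is empty here.
    credentials = base64.b64encode(f"{PROXY_USER}:".encode()).decode("ascii")
    conn = http.client.HTTPSConnection(PROXY_HOST, PROXY_PORT, timeout=10)
    conn.set_tunnel(
        target_host, 443, headers={"Proxy-Authorization": f"Basic {credentials}"}
    )
    return conn


conn = open_tunnel("httpbin.org")
body = json.dumps(
    {"vault_token": "tok_sandbox_34vnd2mQpHBjoEKH66zt59", "another field": "bar baz"}
)

# Sending requires network access, and certificate verification only succeeds
# once the proxy's CA root certificate is trusted (curl's -k skips that check):
# conn.request("POST", "/post", body=body, headers={"Content-Type": "application/json"})
# print(conn.getresponse().read().decode("utf-8"))
```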

Debugging

Introspecting Traffic

Request headers from your application will differ depending on whether you are tokenizing or de-tokenizing.

Tokenization

During tokenization, the reverse proxy operates in the same way as a front-end to a web application. For example, if you are hosting an application written in Java inside a Tomcat container, typically you will put Apache or Nginx in front of Tomcat in a reverse proxy configuration. Requests to your application endpoint begin at the Apache layer, which forwards the same request to the application inside of Tomcat. Request headers coming from your application usually look something like:

> POST /account/create HTTP/1.1
> Host: demo.sandbox.verygoodproxy.com
> User-Agent: curl/7.49.1
> Accept: */*
> content-type: application/json
> Content-Length: 124
>

content-type is important because the proxy uses this value to determine which rules engine to invoke. Ensure the value matches the type of structured data you wish to redact.
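The dependency on content-type can be pictured with the following sketch. It is an illustration rather than the proxy's actual logic: rule_language and xml_matches are hypothetical names, and the XPath matching uses the limited XPath subset supported by Python's xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

# Rule language per media type, as described in this guide.
RULE_LANGUAGES = {
    "application/json": "JSONPath",
    "application/xml": "XPath",
}


def rule_language(content_type_header: str) -> str:
    """Map a content-type header value to the rule language used for matching."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    try:
        return RULE_LANGUAGES[media_type]
    except KeyError:
        raise ValueError(f"no rule language for {media_type!r}") from None


def xml_matches(body: str, xpath: str) -> list:
    """Return the text of every element selected by an XPath rule."""
    return [element.text for element in ET.fromstring(body).findall(xpath)]


print(rule_language("application/json; charset=utf-8"))
print(
    xml_matches(
        "<account><vault_token>some secret data</vault_token></account>",
        "./vault_token",
    )
)
```

A body sent with a content-type that does not match its actual format would be checked against the wrong kind of rule, so nothing would be redacted.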

De-Tokenization

During de-tokenization, the request cycle is more complex. It typically looks something like:

* Connected to demo.sandbox.verygoodproxy.com (52.32.191.42) port 8080 (#0)
* Establish HTTP proxy tunnel to example.com:443
* Proxy auth using Basic with user '6657c588-dd89-4781-8505-3c8e42f1d30b'
> CONNECT example.com:443 HTTP/1.1
> Host: example.com:443
> Proxy-Authorization: Basic NjY1N2M1ODgtZGQ4OS00NzgxLTg1MDUtM2M4ZTQyZjFkMzBiOg==
> User-Agent: curl/7.49.1

This CONNECT request asks the forward proxy to open a tunnel to the remote endpoint. Notice that the protocol in the request is plain HTTP, even though the destination port is 443; the TLS session starts only after the proxy confirms that it opened a backend connection to the server you requested. What follows typically looks something like

* Proxy replied OK to CONNECT request
* TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
* Server certificate: littleproxy
> POST /account/transact/ HTTP/1.1
> Host: example.com
> User-Agent: curl/7.49.1
> Accept: */*
> content-type: application/json
> Content-Length: 146
>

The proxy informs your application that it successfully opened a backend connection and sent the de-tokenized data to the server specified in the Host header. The content-type header is as important here as it was during tokenization.
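When debugging authentication, it can help to decode the Proxy-Authorization value from the trace and confirm which user name was sent. The helper below is a small sketch (decode_proxy_authorization is a hypothetical name) that handles only the Basic scheme, applied to the value from the CONNECT request above.

```python
import base64


def decode_proxy_authorization(header_value: str) -> tuple:
    """Split a 'Basic <base64>' Proxy-Authorization value into (user, password)."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("only the Basic scheme is handled here")
    decoded = base64.b64decode(encoded).decode("utf-8")
    # Basic credentials are "user:password"; split on the first colon only.
    user, _, password = decoded.partition(":")
    return user, password


user, password = decode_proxy_authorization(
    "Basic NjY1N2M1ODgtZGQ4OS00NzgxLTg1MDUtM2M4ZTQyZjFkMzBiOg=="
)
print(user)            # 6657c588-dd89-4781-8505-3c8e42f1d30b
print(repr(password))  # ''
```

The empty password matches the trailing colon in the decoded credentials.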

Via header

Responses from both proxies (the reverse proxy for tokenization and the forward proxy for de-tokenization) have two headers you can use for maintaining state in your application.

> Via: 1.1 6f606af5cf35
> VGS-Request-Id: 63b555b58bea5b5a

The Via header is defined in the HTTP/1.1 specification and tells your server which proxies the request traveled through. The VGS-Request-Id header contains a value that you can pass along during a session to maintain persistent tokens. This allows you to interact with the same token at different times during a session, for example when you update data that does not need redacting after you have already created a token for redacted data. You can keep storing the same token, because the system recognizes the session by its unique fingerprint.
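A short sketch of reading these two headers in Python follows. The proxy_metadata helper is hypothetical. It takes an email.message.Message, the base class of the header objects returned by http.client and urllib, whose lookups are case-insensitive, so Vgs-Request-Id and VGS-Request-Id are treated alike.

```python
from email.message import Message


def proxy_metadata(headers: Message) -> dict:
    """Extract the proxy hops and the VGS request id from response headers."""
    hops = []
    # A Via header may list several proxies separated by commas.
    for entry in (headers.get("Via") or "").split(","):
        protocol, _, received_by = entry.strip().partition(" ")
        if protocol:
            hops.append((protocol, received_by.strip()))
    return {"hops": hops, "request_id": headers.get("VGS-Request-Id")}


# Stand-in for a real response's headers, using the values shown above.
headers = Message()
headers["Via"] = "1.1 6f606af5cf35"
headers["VGS-Request-Id"] = "63b555b58bea5b5a"
print(proxy_metadata(headers))
```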

How it works

Reverse proxying (for Tokenizing)

VGS’ reverse proxy works like your existing application server’s proxy. We act as your service’s front-end, while your own application server’s proxy acts as the back-end. From the client’s perspective, requests terminate at the VGS proxy front-end. The client remains unaware of the application server side of the request.

Note: One feature commonly used in reverse proxies is load balancing. VGS does not use this feature.

Forward Proxying (for De-Tokenizing)

A forward proxy works much like a corporate firewall on an enterprise network. Outgoing requests are passed to the proxy before leaving the network for their final destination. The proxy can be configured to modify the content and then forward it, or to forward it unaltered. From the destination’s perspective, requests appear to originate from the back-end of the proxy. Final recipients are not aware of the client side or the application server side of the request.

Wait, are you a man-in-the-middle for all my traffic?

Yes, we are. We built a CDN (think CloudFlare), but with application level features.

The basic idea is to pretend to be the server to the client, and pretend to be the client to the server, while we sit in the middle decoding traffic from both sides. The tricky part is that the Certificate Authority system is designed to prevent exactly this attack, by allowing a trusted third-party to cryptographically sign a server’s SSL certificates to verify that they are legit. If this signature doesn’t match or is from a non-trusted party, a secure client will simply drop the connection and refuse to proceed. Despite the many shortcomings of the CA system as it exists today, this is usually fatal to attempts to MITM an SSL connection for analysis. Our answer to this conundrum is to become a trusted Certificate Authority ourselves. Mitmproxy includes a full CA implementation that generates interception certificates on the fly. To get the client to trust these certificates, we register mitmproxy as a trusted CA with the device manually.

The security implications here require a little attention. You have two options to securely configure your account.

  • The first option is to download our CA root certificate from the control panel and install it in your own application as described in the link. Generally this requires that you add our CA root certificate on each requesting endpoint. Depending on which platform you use, this process can vary quite a bit.

  • The second option is to use a CA-signed certificate for your own domain name with our proxy as an SNI value. Our proxy will present this certificate to the client and everything validates along the trust chain. Since it is signed by a trusted root CA, the client won’t complain about validation. Essentially, you are stating in your server’s certificate that you trust the VGS proxy.
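For a Python client, the first option could look roughly like the sketch below, which adds a downloaded CA root certificate to an otherwise default TLS context instead of disabling verification. The helper name proxy_tls_context and the file name in the comment are hypothetical.

```python
import ssl
from typing import Optional


def proxy_tls_context(ca_root_path: Optional[str] = None) -> ssl.SSLContext:
    """Build a verifying TLS context, optionally trusting an extra CA root."""
    # The default context verifies certificates and checks host names.
    context = ssl.create_default_context()
    if ca_root_path is not None:
        # Trust the downloaded CA root in addition to the system roots.
        context.load_verify_locations(cafile=ca_root_path)
    return context


# After downloading the certificate, pass its path, for example
# proxy_tls_context("vgs-ca-root.pem") (hypothetical file name).
context = proxy_tls_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

The resulting context can be handed to urllib.request.urlopen or http.client.HTTPSConnection through their context parameter.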

Going Live

By changing two configuration values you can enable the VGS proxy for your application.

  1. Update your back-end application server endpoint to the IP address that the DNS name in your application code currently resolves to.
  2. Then update the DNS record for that name so that it aliases your VGS account DNS name.

For example, let’s say you own a service that sends a POST request to the URL https://api.example.com/account/edit. Rather than changing your source code to send the same request to https://company.sandbox.verygoodproxy.com/edit, we can create a DNS record that maps api.example.com to company.sandbox.verygoodproxy.com. We also handle all the SSL certificates for this domain name, so your requests continue to be encrypted and secure. This feature requires that you own the domain name and have the authority to change records in that domain.
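Once the record is in place, a quick sanity check is to confirm that both names resolve to the same addresses. The sketch below is one way to do that in Python. The function names are hypothetical, and the demonstration uses a stand-in resolver with made-up addresses so that it runs offline. Results from live DNS can vary between lookups, so treat a mismatch as a hint rather than proof.

```python
import socket


def resolve(host: str) -> set:
    """Return the set of IP addresses a host name currently resolves to."""
    infos = socket.getaddrinfo(host, 443, proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}


def aliases_proxy(app_host: str, proxy_host: str, resolver=resolve) -> bool:
    """True when every address of app_host is also an address of proxy_host."""
    app_addresses = resolver(app_host)
    return bool(app_addresses) and app_addresses <= resolver(proxy_host)


# Offline demonstration with a stand-in resolver (addresses are made up).
fake_dns = {
    "api.example.com": {"203.0.113.10"},
    "company.sandbox.verygoodproxy.com": {"203.0.113.10", "203.0.113.11"},
}
print(
    aliases_proxy(
        "api.example.com",
        "company.sandbox.verygoodproxy.com",
        resolver=fake_dns.__getitem__,
    )
)
```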

Roadmap

Fingerprinting

Session fingerprinting is a feature that allows multiple requests containing the same token to use the same token data on the backend. This is helpful if you need to update some keys but do not want to generate new tokens for data that has not changed.

Static IP address per-connection

You can enable a feature where each request you send has a different proxy IP address as seen by the destination endpoint. This helps anonymize your application, since the destination cannot trace the request back to its origin from the IP address alone.

SFTP

Using any SFTP library and setting the ProxyCommand to your account DNS name will allow you to redact or enrich the contents of files you transfer. Since SFTP runs over SSH, the standard SSH proxying settings are used.