Architecture Overview

Files.com is a true cloud-native file orchestration and automation platform built from the ground up for modern scale, resilience, and flexibility.

Rather than retrofitting legacy file transfer systems into the cloud, Files.com is architected to auto-scale, recover, and serve in a globally distributed, multi-region model.

Our architecture is layered, with separation of concerns across user-facing clients, routing / proxy, core APIs, regional services, and storage + data infrastructure.

Each layer is designed for high availability, security, resilience, and observability.

Below is an expanded walkthrough of each layer, including additional nuance and detail where available.

Files.com Owned & Developed Client Applications (Client Tier)

Components & Variants

  • Files.com Web Application (browser interface)
  • Files.com Desktop Application v6 — for Windows and macOS
  • Files.com Mobile Apps — for iOS and Android
  • Files.com CLI — a command-line tool for power users and automation
  • Files.com SDKs — language-specific SDKs that provide supported connectivity to Files.com (e.g. Python, Go, Ruby, Node.js)
  • Files.com Boomi, MuleSoft, Zapier, and Microsoft Power Automate Integrations — custom integrations that we've built, but which are distributed through these iPaaS vendors as plugins or components

These client-side applications are built and maintained by Files.com, with the aim of tight integration, correct execution, and lightning-fast performance.

Engineering & Development Practices

  • We develop all of our applications in-house and, other than minor components, do not outsource any application development.
  • The Files.com platform takes an API-first approach: all primary client apps use the same published REST API endpoints, so every UI action is backed by an API call (see the sketch after this list).
  • There may be a small number of undocumented internal API endpoints (e.g. billing), but they are the exception, not the norm.
  • While the core is developed in-house, Files.com uses industry-standard and open-source components where appropriate (e.g. HTTP libraries, cryptographic primitives, open-source SDKs).
  • The development pipeline follows a secure software development lifecycle (SDLC) approach: code reviews, automated testing, static analysis, dependency scanning, CI/CD, and controlled release processes (e.g. blue/green or canary deployments).
  • Because all clients share the same API surface, new features roll out consistently and uniformly across web, mobile, CLI, and SDKs.
  • Our applications use our own SDKs internally wherever possible.
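
To make the API-first point above concrete, here is a minimal sketch of the kind of REST call that backs a UI action such as listing a folder. It uses plain HTTP with the Python requests library; the endpoint path, header name, and response shape shown are illustrative assumptions rather than a definitive API reference, so consult the published Files.com REST API documentation or an official SDK for the exact interface.

```python
# Minimal sketch: listing a folder via a REST call, the same kind of
# published endpoint that the Web App, CLI, and SDKs all share.
# NOTE: the endpoint path, header name, and response shape below are
# illustrative assumptions, not a definitive API reference.
import requests

API_KEY = "YOUR_API_KEY"                        # hypothetical credential
BASE_URL = "https://app.files.com/api/rest/v1"  # assumed base URL

def list_folder(path: str) -> list:
    """Return the entries of a folder, as a UI file listing would."""
    response = requests.get(
        f"{BASE_URL}/folders/{path}",
        headers={"X-FilesAPI-Key": API_KEY},    # assumed auth header
        timeout=30,
    )
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    for entry in list_folder("inbound"):
        print(entry.get("type"), entry.get("path"))
```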

Benefits & Implications

  • Feature parity ensures that no client is “left behind” — whatever’s possible through the API is possible through the UI or CLI.
  • Bug fixes or enhancements in backend logic benefit all clients immediately.
  • Having in-house clients gives us more control over user experience, performance tuning, and security.
  • The unified API approach reduces “shadow copy” or bespoke integrations that diverge over time.

Proxy / Edge / Routing Layer (Regional)

This layer is the front door to the platform: it handles incoming client connections, SSL termination for HTTPS, routing, and initial load balancing.

Files.com does not rely on AWS ELBs in this role; instead, we maintain a fleet of proxy servers that we control.

Key Functions & Features

  • Traffic Routing and Load Balancing
    • The proxies (running HAProxy, NGINX, and Dante) direct incoming requests to the appropriate backend microservices or API endpoints.
    • For HTTP/HTTPS, routing is done at layer 7 (based on host, URL, and path).
    • For SFTP, FTP, and FTPS, the same infrastructure is used to load-balance raw TCP/SSL (layer 4).
    • HAProxy (and similar proxy software) supports both layer-4 (TCP) and layer-7 (HTTP) modes.
    • The configuration supports zero-downtime code deployments — proxies can update routing rules without dropping existing connections, via techniques like socket handoff and hot reconfiguration.
  • SSL / TLS Termination
    • SSL/TLS is terminated at the proxy layer for HTTPS; decrypted traffic is then forwarded internally as needed. We re-encrypt internal HTTPS traffic as well.
    • By terminating at the edge, Files.com can manage certificates and enforce strong cipher suites and TLS settings centrally.
  • Customer-Dedicated IPs
    • For customers requiring static or dedicated IPs (e.g. to allow partner firewalls to whitelist), the proxies have those IP addresses mapped directly to them.
    • When a customer has two dedicated IPs, those are mapped to distinct proxy servers via AWS networking.
  • Regional Deployment & Redundancy
    • This proxy layer exists in all 7 regions.
    • Within each region, the proxies are configured in HA (high availability) clusters, with active-active failover powered by Round Robin DNS (RRDNS).
  • Security, Traffic Control & Edge Filtering
    • Proxies are the natural choke point for WAF rules, rate limiting, DDoS mitigation, IP filtering, access control lists (ACLs), and blocking invalid requests.
    • Logging and metrics from this layer are crucial for observability (latency, traffic volumes, error rates).
    • Proxy logs (both access and error logs) are sent to centralized log systems (via Filebeat) for auditing, alerting, and troubleshooting.
    • Care is taken to preserve client IPs downstream (e.g. via the PROXY protocol or X-Forwarded-For headers) so that our backend services see the true origin IPs.
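
To illustrate the last point, here is a minimal sketch of how a backend service behind a trusted proxy tier might recover the original client IP from an X-Forwarded-For header. This is a generic illustration, not Files.com's actual implementation.

```python
# Illustrative sketch only: recovering the true client IP behind a
# trusted proxy tier. Not Files.com's actual implementation.
from ipaddress import ip_address

def client_ip_from_headers(headers: dict, peer_ip: str) -> str:
    """Prefer X-Forwarded-For (set by the proxy) over the TCP peer IP."""
    forwarded = headers.get("X-Forwarded-For", "")
    if forwarded:
        # The left-most entry is the original client; later entries are
        # proxies the request passed through.
        candidate = forwarded.split(",")[0].strip()
        try:
            ip_address(candidate)   # validate before trusting the value
            return candidate
        except ValueError:
            pass
    return peer_ip  # fall back to the direct connection's source IP

print(client_ip_from_headers({"X-Forwarded-For": "203.0.113.7, 10.0.0.2"}, "10.0.0.2"))
```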

Core API Layer (Central / Global)

This is the “brains” of the system: the set of microservices that expose business logic, metadata operations, orchestration, and the central control plane.

Stack & Technologies

  • The core APIs are largely implemented in Ruby — including our REST endpoints, authentication, orchestration, metadata, and control logic.
  • Underlying data technologies include:
    • MySQL / Amazon Aurora for relational metadata (users, permissions, configurations)
    • Redis for caching, fast ephemeral state, and in-memory operations
    • Memcached for caching
    • Elasticsearch for metadata search, full-text search, and log and audit indexing
    • Amazon S3 for primary file storage (object store)
    • Amazon Redshift for analytics and reporting
  • Autoscaling is built into the API server fleet to accommodate variable load.

Multi-Tenancy & Isolation

  • Files.com is a multi-tenant system: customers share the same infrastructure physically, but are isolated logically (i.e. namespace separation).
  • Security boundaries exist such that one customer cannot see or impact another’s data or operations.

Inter-Service Communication

  • Microservices communicate via message queuing systems (including Redis queues and ZeroMQ) for asynchronous tasks and event-driven workflows (see the sketch below).
  • Synchronous RPC or HTTP calls are used for immediate, API-driven operations.
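
As a rough illustration of the asynchronous pattern above, the sketch below hands a job off through a Redis list using the redis-py client. The queue name and payload fields are hypothetical; Files.com's internal job format is not documented here.

```python
# Hedged sketch of queue-based, asynchronous job hand-off over Redis.
# Queue name and payload fields are hypothetical examples.
import json
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

def enqueue_job(queue: str, payload: dict) -> None:
    """Producer side: push a JSON-encoded job onto a Redis list."""
    r.lpush(queue, json.dumps(payload))

def run_worker(queue: str) -> None:
    """Consumer side: block until a job arrives, then process it."""
    while True:
        _, raw = r.brpop(queue)          # blocks until an item is available
        job = json.loads(raw)
        print("processing", job["type"], job.get("path"))

enqueue_job("jobs:example", {"type": "webhook", "path": "/inbound/report.csv"})
# run_worker("jobs:example") would then consume jobs in a loop.
```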

Background Jobs & Worker Pools

  • The core layer schedules and triggers various background or automation tasks (e.g. event triggers, file syncs, transformation jobs).
  • Centralized worker pools handle centrally coordinated tasks, but file content processing is always executed regionally in the Regional Services Layer (to respect data locality, latency, and cost).

Regional Services Layer (Edge / Protocol Services)

This is where protocol-specific functions live (SFTP, FTP, WebDAV, etc.), and where file data flows in/out. Because file content is bulky and latency-sensitive, this layer must reside near the client / data region.

Key Services

Each of the following is an independent service, with its own autoscaling pool of EC2 instances in each region:

  • FTP / FTPS
  • SFTP
  • WebDAV
  • Inbound S3 (i.e. accepts uploads via the S3 protocol)
  • Regional Integration Services (i.e. pushes or pulls of files to/from external endpoints)
  • Regional Job Processing (file movement, transformations, syncs within region)

These services handle data plane operations — e.g. actual file transfer, proxying, buffering, etc.

This layer also includes region-specific Redis clusters for metadata and authentication caching.

Isolation & Security

  • Services are isolated at both the machine level (dedicated instances per service) and network level (via AWS Security Groups) to restrict allowed traffic.
  • Each region’s services live within its own VPC, and are interconnected via VPC peering for internal control-plane coordination.
  • Each service operates as a multi-tenant system: customers share the same infrastructure physically, but are isolated logically by connection or RPC job.
  • All services connect to the same region-specific Redis cluster for metadata and authentication caching. One cluster is used for the entire region, and we take great care to apply logical separation within the cluster so that each customer's data occupies its own portion of the key space.
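
The sketch below shows one common way to achieve that kind of logical separation: prefixing every key with a customer (site) identifier so that each tenant occupies a disjoint portion of the key space. It is a generic illustration of the pattern, not the exact key scheme Files.com uses.

```python
# Generic illustration of per-tenant key-space separation in Redis.
# The key naming scheme shown here is an assumption, not Files.com's.
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

def tenant_key(site_id: int, suffix: str) -> str:
    """Build a key that confines each customer to its own namespace."""
    return f"site:{site_id}:{suffix}"

def cache_session(site_id: int, session_id: str, user_id: int) -> None:
    # Each customer's sessions live under their own prefix and expire
    # automatically, so tenants never collide on keys.
    r.setex(tenant_key(site_id, f"session:{session_id}"), 3600, user_id)

cache_session(42, "abc123", 7)
```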

Data Flow Dynamics

  • When a file operation arrives (say, client uploading over SFTP), it is handled by the regional service instance closest to the client (assigned via geo-DNS routing).
  • That service will interact with the central API layer for metadata, policy checks, and orchestration, but the heavy data flow is local to minimize cross-region data transfer.

Regional Integration Services

Regional Integration Services are the microservices that implement our integration with a given type of remote service (such as SharePoint, remote SFTP, etc.). Although we speak of these as one service, they are actually independent technical implementations (some in Ruby, some in Golang).

Storage, Data & Persistence Layer

This is the foundation that persists files, metadata, audit logs, and all underlying state.

Multi-Tier Architecture

  • Primary Storage (Object Store):
    • Customer files (blobs, file content) are stored in AWS S3 (encrypted at rest).
    • We use one master bucket for each region for customer data storage. Within that bucket, customer data is separated by key (folder). We centralize S3 signing in a manner that ensures only operations related to a single customer can sign an S3 request related to that customer's data.
  • Metadata / Indexing / Caching:
    • Amazon Aurora is used for tracking user accounts, configurations, file metadata, directory structure, permissions, and other relational data.
    • Redis is used for high-speed caching, ephemeral state, session data, locks, and queue coordination.
    • Elasticsearch is used for indexing metadata and supporting search, audit queries, and log indexing.
  • Audit & Logging:
    • Many user actions and system events generate logs and audit records. These are streamed or batched into Elasticsearch (for query) and S3 (for long-term retention and archive).
    • Each log stream has a different retention policy, chosen to optimize compliance with regulatory and customer requirements.

External / Hybrid Storage Integration

  • Files.com can mount external storage endpoints (Azure Blob, Google Cloud Storage, S3-compatible endpoints, on-prem file servers, SharePoint, etc.) and present them in the unified namespace.
  • Access to these external storages is mediated through an Integration Service in the Regional Services Layer.
  • From the user’s perspective, external mounts behave like native directories — operations still go through Files.com for governance, permissions, audit, transformations, and movement.

Files.com Agent Layer

The Files.com Agent bridges the Files.com cloud platform with customer-controlled or private environments. It allows Files.com to reach securely into on-premises or private cloud storage without requiring inbound connections or complex networking configurations. This enables true hybrid file orchestration — one of Files.com’s key differentiators versus legacy MFT or EFSS systems.

Purpose & Overview

Many enterprise customers have data or systems that cannot leave their private environment, whether for compliance, data sovereignty, or performance reasons. The Files.com Agent allows Files.com to orchestrate files in place — performing transfers, synchronizations, automations, and audits on storage endpoints that are not directly exposed to the Internet.

In essence, the Agent Layer serves as a secure connector that:

  • Connects private storage (e.g., internal SFTP servers, NAS, SANs, SMB shares, or private S3 endpoints) to the Files.com orchestration engine.
  • Executes file operations locally under customer control.
  • Maintains end-to-end encryption and centralized audit trails.
  • Enables Files.com workflows (automations, triggers, syncs) to run against private data just as easily as cloud data.

Architecture & Design

The Files.com Agent runs as a lightweight daemon or service within the customer’s infrastructure — most commonly installed on:

  • A Linux or Windows server in a data center or private VPC.
  • A VM or container inside a private cloud (AWS, Azure, GCP, or private OpenStack).

Each Agent establishes a mutually authenticated, outbound encrypted connection to the Files.com cloud (Proxy layer). This ensures:

  • No inbound firewall changes required — only outbound HTTPS (port 443) traffic is needed.
  • Strong mutual identity verification — the Agent authenticates using an issued certificate tied to the customer’s account.
  • Encryption for all control and data-plane communications.
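
A minimal sketch of that connection pattern follows: an outbound TLS connection in which the client presents its own certificate for mutual authentication. The hostname, port, and file paths are placeholders, and the real Agent's handshake and wire protocol are internal to Files.com.

```python
# Sketch of an outbound, mutually authenticated TLS connection, the
# general pattern the Agent uses. Hostname and file paths are placeholders.
import socket
import ssl

CONTROL_PLANE = ("agent.example.files.com", 443)   # hypothetical endpoint

context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
# Present the Agent's issued certificate so the server can verify
# the Agent's identity (mutual TLS).
context.load_cert_chain(certfile="agent.crt", keyfile="agent.key")

with socket.create_connection(CONTROL_PLANE) as raw_sock:
    with context.wrap_socket(raw_sock, server_hostname=CONTROL_PLANE[0]) as tls:
        # From here the Agent would poll for work and report health,
        # all over this single outbound connection.
        print("negotiated", tls.version(), "with", tls.getpeercert()["subject"])
```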

Internal Components

  • Agent Core — handles communication with Files.com control plane, job dispatch, and health monitoring.
  • RPC Job Executors — perform local file operations (copy, move, upload, download, sync).
  • Credential Vault — securely stores connection credentials for local resources.
  • Updater — ensures the Agent stays patched automatically and aligned with current Files.com versions.

The Agents are stateless from the Files.com perspective — multiple Agents can be deployed in parallel for scale or redundancy, all registered under the same Files.com environment.

Security and Compliance Characteristics

  • Outbound-only Connectivity: The Agent never exposes inbound ports, minimizing attack surface.
  • Per-Agent Identity: Each Agent has a unique identity and certificate; compromised Agents can be revoked immediately without affecting others.
  • Zero Data Persistence: The Agent is not a data store; for the most part it streams data directly during transfer, keeping no permanent local cache unless explicitly configured. Uploads are temporarily buffered to disk in the temporary directory of the destination disk drive, and the Agent sweeps these temporary disk buffers regularly (see the sketch after this list).
  • Encryption Everywhere: All communications are encrypted; file data may also be encrypted at rest on both endpoints independently.
  • Customer Governance: The Agent runs in the customer’s administrative domain, allowing them to control OS patching, network configuration, and deployment topology.
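
The buffer sweeping mentioned above is conceptually similar to the sketch below, which removes stale files from a temporary directory. The directory, file naming, age threshold, and scheduling shown are placeholders rather than the Agent's actual logic.

```python
# Conceptual sketch of sweeping stale upload buffers from a temporary
# directory. Path, file naming, and age threshold are placeholders.
import time
from pathlib import Path

def sweep_temp_buffers(temp_dir: str, max_age_seconds: int = 3600) -> None:
    """Delete buffered upload files older than max_age_seconds."""
    cutoff = time.time() - max_age_seconds
    for entry in Path(temp_dir).glob("*.upload-buffer"):
        if entry.is_file() and entry.stat().st_mtime < cutoff:
            entry.unlink(missing_ok=True)

sweep_temp_buffers("/tmp/files-agent-buffers")
```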

Deployment & Operations

  • Agents are distributed via signed installers (Linux packages, Windows service executables).
  • Setup is simple: register the Agent via the Files.com Admin UI, download a tokenized configuration file, and launch the agent.
  • Once registered, the Agent maintains a persistent outbound session to the Files.com control plane, reporting health and receiving work items.
  • Load balancing among multiple Agents can be automatic: the Files.com orchestration engine distributes jobs across available Agents based on capacity and locality.

Typical enterprise deployments run two or more Agents per site for redundancy, often behind corporate firewalls or in segregated network segments.

Operational Visibility

  • Each Agent’s status (online/offline, version, last check-in) is visible in the Files.com Admin UI.
  • Alerts can be configured for Agent health events (disconnection, job failures, version drift).
  • Agents automatically receive updates and new job definitions from the Core API.

Performance & Scalability

  • Agents can process multiple jobs concurrently, limited only by local system resources and configured concurrency thresholds.
  • When multiple Agents are deployed, Files.com’s orchestration engine distributes workloads intelligently to prevent overloading any single Agent.
  • Network throughput is typically bounded by the customer’s outbound Internet capacity; Files.com optimizes data streaming with compression and multipart transfer techniques.

Infrastructure & Cross-Cutting Concerns

Networking & Inter-Region Connectivity

  • Regions are networked together (via VPC peering) for control-plane communication, routing, and failover.
  • Data locality is respected: file content is generally not moved across regions unless requested by a user.
  • Proxy / edge routing via geo-DNS ensures clients talk to the nearest region for latency and performance. Customers can mandate use of a single region through a sitewide setting.

Redundancy, Failover & High Availability

  • Every component is deployed in an HA configuration (multiple instances across multiple availability zones). We always use at least 4 availability zones in the USA region and 2 availability zones in all other regions.
  • The system supports both cross-AZ and cross-region failover: if a region fails, traffic and operations can be routed to another region, unless a site's settings require operations to remain in a specific region.
  • Most services are configured to deploy changes using safe patterns (blue/green, canary, and rolling updates) to avoid customer-impacting downtime.

Security, Encryption & Key Management

  • In Transit: All client-to-proxy communication is encrypted via TLS (modern ciphers) or SSH (for SFTP) unless unencrypted FTP is manually enabled on a site.
  • At Rest: Files stored in S3 and other storage layers are encrypted (AES-256).
  • Key Management: Files.com handles keys internally (with appropriate rotation, isolation, and audit controls). Customers have options for adding additional encryption via GPG.
  • Identity & Access: Files.com offers multiple types of authentication (SSO, API keys, MFA), role-based access control, per-user and per-group permissions.
  • Time Synchronization & Clocking: All systems are synchronized via NTP to maintain consistency and reliable timestamps across logs.

Observability, Monitoring & Telemetry

  • Metrics exist at every layer: latency, throughput, error rates, system health.
  • We have centralized logging, tracing, and alerting (e.g. via Elastic, InfluxDB, CloudWatch, Sensu, and Grafana).
  • We have highly sophisticated internal dashboards for internal ops and support teams.
  • Customers have direct access to their logs through both the Files.com web app and the API. These queries are streamed directly from our master Elasticsearch clusters.
  • We have highly sophisticated automated alerts and remediation for degraded behavior.

Compliance & Governance

  • The architecture and hosting environment are reviewed in annual audits (e.g. SOC 2 Type II) and documented in compliance docs.
  • Controls exist for data retention, deletion, export, and data residency constraints (e.g. retaining data only in certain jurisdictions).

End-to-End Flow Examples

These examples illustrate how Files.com orchestrates data across different protocols and destinations. While each workflow may vary slightly depending on configuration and destination type, the following outlines represent the most common flows.

File Upload from a Files.com Client to Native Files.com Storage

Here’s a simplified sequence illustrating a file upload from a client to Files.com native storage:

  1. A Files.com client (Web App, Desktop App, CLI, or SDK) initiates an upload by making an API request to the Files.com API.
  2. The request is routed via geo-DNS to the nearest regional proxy.
    • The proxy terminates TLS, performs first-level WAF and security checks, and forwards the request to the Core API.
  3. The API validates the session, applies rate-limit and permission checks, and creates an upload object that includes a securely signed URL pointing to a customer-scoped path in Files.com’s Amazon S3 bucket (see the sketch after this list).
  4. The client then opens a direct connection to Amazon S3 using that signed URL and uploads the file data.
    • In this flow, the file data travels directly from the client to S3 and does not traverse Files.com’s internal infrastructure.
  5. After the upload completes, the client makes another API request (via the proxy) to confirm completion.
  6. The Files.com API validates the uploaded object and triggers background jobs as appropriate (e.g., GPG encryption, metadata indexing, transformations, webhooks, and audit entries).
    • These jobs execute in the regional or central worker pools depending on their scope.
  7. Audit logs are generated at every stage — proxy access logs, API logs, and historical user activity logs.
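
Step 3 above mentions a securely signed URL scoped to a single customer's path. Here is a hedged sketch of generating such a presigned S3 upload URL with boto3; the bucket name, key layout, and expiry are illustrative assumptions, and the centralized signing service that Files.com operates enforces per-customer controls not shown here.

```python
# Hedged sketch: issuing a presigned, customer-scoped S3 upload URL.
# Bucket name, key layout, and expiry are illustrative assumptions.
import boto3

s3 = boto3.client("s3", region_name="us-east-1")

def signed_upload_url(customer_id: int, path: str, expires: int = 900) -> str:
    """Sign a PUT that is only valid for this customer's key prefix."""
    key = f"{customer_id}/{path.lstrip('/')}"   # per-customer prefix
    return s3.generate_presigned_url(
        "put_object",
        Params={"Bucket": "files-region-primary", "Key": key},
        ExpiresIn=expires,
    )

# The client then PUTs the file bytes directly to this URL, so the
# content never traverses the API servers.
print(signed_upload_url(42, "/inbound/report.csv"))
```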

File Upload from an SFTP Client to Files.com Storage

Here’s a simplified sequence illustrating a file upload from an SFTP client to Files.com native storage:

  1. An external client initiates an SFTP connection (port 22) to app.files.com, or to a customer-specific subdomain or custom domain.
  2. The connection is routed via geo-DNS to the nearest regional proxy (or region assigned to the subdomain), which forwards it to the SFTP service in the Regional Services Layer.
  3. The SFTP service performs user authentication by opening an API session with the Files.com Core API on the client’s behalf.
    • This session is cached in Redis for reuse in future SFTP connections.
  4. The SFTP service then initiates an internal API request to create an upload object, bypassing the proxy layer for efficiency.
    • The API validates the session, rate limits, and permissions, and returns a securely signed S3 upload URL.
  5. The SFTP service opens a direct connection to Amazon S3.
  6. The client streams file data via SFTP to the regional SFTP service, which forwards it directly to S3 using in-memory buffering — data is never written to disk (see the sketch after this list).
  7. Once the upload completes, the SFTP service notifies the API internally, marking the upload complete.
  8. The API triggers background jobs and logging, just as in the previous workflows.
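
The sketch below illustrates the in-memory streaming idea from step 6: forwarding an incoming byte stream to S3 without writing it to disk, using boto3's upload_fileobj. It is a simplified stand-in for the actual SFTP service, and the bucket and key names are placeholders.

```python
# Simplified stand-in for step 6: stream an incoming file object to S3
# using only in-memory buffering. Bucket and key are placeholders.
import io
import boto3

s3 = boto3.client("s3")

def stream_upload_to_s3(incoming, bucket: str, key: str) -> None:
    """Forward a readable byte stream to S3 without touching local disk."""
    # upload_fileobj reads the stream in chunks (multipart under the
    # hood for large objects), so only small buffers live in memory.
    s3.upload_fileobj(incoming, bucket, key)

# In the real service, `incoming` would be the SFTP data channel; here a
# BytesIO stands in for it.
stream_upload_to_s3(io.BytesIO(b"example file contents"), "files-region-primary", "42/inbound/report.csv")
```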

File Copy or Sync from Remote Server (i.e. SharePoint, remote SFTP, etc.) to Files.com

Here’s a simplified sequence illustrating the copying of a file from a remote server (such as SharePoint or a remote SFTP) to Files.com.

  1. A copy or sync operation is initiated — either manually by a user or automatically through a scheduled sync or automation. This results in a job being created and executed within the central worker pool.
  2. The central worker issues two RPC jobs to the Regional Integration Service:
    • Source RPC: Sent to the service worker in the region associated with the source (e.g., SharePoint or remote SFTP).
      • This worker returns a secure signed URL that represents the source file stream.
    • Destination RPC: Sent to the service worker in the region associated with the destination (in this case, the target Files.com region), providing the signed URL returned from the source.
  3. The destination service worker connects to the source service worker using that signed URL.
    • The source service downloads the file from its remote endpoint and streams it directly to the destination service.
    • The destination service, in turn, streams the file to its final storage destination (in this case, Amazon S3).
  4. The entire operation is streamed in real time with in-memory buffering only — no intermediate files are ever written to disk, even for Box and OneDrive remotes (see the sketch after this list).
  5. Upon completion, the central worker job finalizes metadata, triggers other applicable background jobs, and records full audit logs for the transaction.
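
The following sketch captures the streaming hand-off from steps 3 and 4: a worker reads the source stream over its signed URL in chunks and forwards it to the destination upload URL without intermediate files. The URLs, chunk size, and timeouts are illustrative placeholders.

```python
# Illustrative sketch of the chunked, in-memory relay between a source
# signed URL and a destination upload URL. URLs are placeholders.
import requests

def relay_stream(source_url: str, destination_url: str, chunk_size: int = 8 * 1024 * 1024) -> None:
    """Stream the source to the destination without writing to disk."""
    with requests.get(source_url, stream=True, timeout=300) as src:
        src.raise_for_status()
        chunks = src.iter_content(chunk_size=chunk_size)
        # requests accepts a generator as the request body and sends it
        # with chunked transfer encoding, so nothing is buffered to disk.
        resp = requests.put(destination_url, data=chunks, timeout=300)
        resp.raise_for_status()

relay_stream("https://source.example/signed-download", "https://destination.example/signed-upload")
```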

File Upload from a Files.com Client to SharePoint

Here’s a simplified sequence illustrating a file upload from a client directly to a remote server destination such as SharePoint:

  1. The client initiates an upload to a SharePoint-mounted destination via the Files.com API.
  2. The request is routed through geo-DNS to the nearest regional proxy, which terminates TLS, performs security checks, and forwards the request to the API layer.
  3. The API performs session, rate-limit, and permission validations.
  4. The API then issues an RPC job to the Regional Integration Service in the region associated with the SharePoint integration.
    • This service generates a securely signed upload URL and returns it to the API.
  5. The API forwards that signed URL to the client.
  6. The client opens a new connection to the signed URL, which routes through a Files.com proxy in the target region to a Regional Integration Service worker.
  7. The Integration Service accepts the incoming data stream and writes it directly to SharePoint using in-memory buffering.
    • Buffers are discarded immediately upon completion.
  8. After data transmission, the API triggers background jobs (as in the previous flow) and emits audit logs for all actions.

Buffered Upload Flow

In some circumstances, uploads to remote destinations use an alternate flow in which the file is first uploaded to Files.com Native Storage and then copied to the remote destination.

This buffered upload flow occurs in the following circumstances:

  • Uploads to OneDrive or Box (their APIs require the full file before upload).
  • Uploads into folders configured for GPG encryption or decryption.
  • Uploads to any remote server which has been explicitly configured by the Site Admin to require buffered uploading via Files.com Native Storage.

These buffer locations are ephemeral and swept daily. In these scenarios, the flow resembles the File Upload from a Files.com Client to Native Files.com Storage flow above, with an additional step to move the file to the remote server. That additional step resembles the File Copy or Sync from Remote Server to Files.com flow described above, except in the reverse direction.
