
Architecture Overview

Files.com is a true cloud-native file orchestration and automation platform built from the ground up for modern scale, resilience, and flexibility.

Rather than retrofitting legacy file transfer systems into the cloud, Files.com is architected to auto-scale, recover, and serve in a globally distributed, multi-region model.

Our architecture is layered, with separation of concerns across user-facing clients, routing / proxy, core APIs, regional services, and storage + data infrastructure.

Each layer is designed for high availability, security, resilience, and observability.

Below is an expanded walkthrough of each layer, including additional nuance and detail where available.

Files.com Owned & Developed Client Applications (Client Tier)

Components & Variants

  • Files.com Web Application (browser interface)
  • Files.com Desktop Application v6 — for Windows and macOS
  • Files.com Mobile Apps — for iOS and Android
  • Files.com CLI — a command-line tool for power users and automation
  • Files.com SDKs — language-specific SDKs that provide supported connectivity to Files.com (e.g. Python, Go, Ruby, Node.js)
  • Files.com Boomi, MuleSoft, Zapier, and Microsoft Power Automate Integrations — custom integrations that we've built, but which are distributed through these iPaaS vendors as plugins or components

These client-side applications are built and maintained by Files.com, with the aim of tight integration, correct execution, and lightning-fast performance.

Engineering & Development Practices

  • We develop all of our applications in-house and, other than minor components, do not outsource development of any of our applications.
  • The Files.com platform takes an API-first approach: all primary client apps use the same published REST API endpoints. That means every UI action is backed by an API call (illustrated in the sketch following this list).
  • There may be a small number of undocumented internal API endpoints (e.g. billing), but they are the exception, not the norm.
  • While the core is developed in-house, Files.com uses industry-standard and open-source components where appropriate (e.g. HTTP libraries, cryptographic primitives, open-source SDKs).
  • The development pipeline follows a secure software development lifecycle (SDLC) approach: code reviews, automated testing, static analysis, dependency scanning, CI/CD, and controlled release processes (e.g. blue/green or canary deployments).
  • Because all clients share the same API surface, new features roll out consistently and uniformly across web, mobile, CLI, and SDKs.
  • Our applications use our own SDKs internally wherever possible.
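
To make the API-first model concrete, here is a minimal Python sketch of the kind of REST call that underlies a UI action such as listing a folder. The host, endpoint path, and header name shown are illustrative assumptions; consult the published API documentation for the authoritative reference.

    import requests

    # Host, path, and header name are illustrative assumptions; see the
    # published Files.com API documentation for authoritative details.
    API_BASE = "https://app.files.com/api/rest/v1"
    API_KEY = "YOUR_API_KEY"

    # List a folder: the same call the web UI, CLI, and SDKs would make,
    # since all clients share the same published API surface.
    response = requests.get(
        f"{API_BASE}/folders/my-folder",
        headers={"X-FilesAPI-Key": API_KEY},
        timeout=30,
    )
    response.raise_for_status()
    for entry in response.json():
        print(entry.get("path"), entry.get("type"))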

Benefits & Implications

  • Feature parity ensures that no client is “left behind” — whatever’s possible through the API is possible through the UI or CLI.
  • Bug fixes or enhancements in backend logic benefit all clients immediately.
  • Having in-house clients gives us more control over user experience, performance tuning, and security.
  • The unified API approach reduces “shadow copy” or bespoke integrations that diverge over time.

Proxy / Edge / Routing Layer (Regional)

This layer is the front door to the platform: it handles incoming client connections, SSL termination for HTTPS, routing, and initial load balancing.

Files.com does not rely on AWS ELBs in this role; instead, we maintain a fleet of proxy servers that we control.

Key Functions & Features

  • Traffic Routing and Load Balancing
    • The proxies (running HAProxy, NGINX, and Dante) direct incoming requests to the appropriate backend microservices or API endpoints.
    • For HTTP/HTTPS, routing is done at layer 7 (based on host, URL, and path).
    • For SFTP, FTP, and FTPS, the same infrastructure is used to load-balance raw TCP/SSL (layer 4).
    • HAProxy (and similar proxy software) supports both layer-4 (TCP) and layer-7 (HTTP) modes.
    • The configuration supports zero-downtime code deployments — proxies can update routing rules without dropping existing connections, via techniques like socket handoff and hot reconfiguration.
  • SSL / TLS Termination
    • SSL/TLS is terminated at the proxy layer for HTTPS; decrypted traffic is then forwarded internally as needed. We re-encrypt internal HTTPS traffic as well.
    • By terminating at the edge, Files.com can centrally manage certificates and enforce strong cipher suites and TLS settings.
  • Customer-Dedicated IPs
    • For customers requiring static or dedicated IPs (e.g. to allow partner firewalls to whitelist), the proxies have those IP addresses mapped directly to them.
    • When a customer has two dedicated IPs, those are mapped to distinct proxy servers via AWS networking.
  • Regional Deployment & Redundancy
    • This proxy layer exists in all 7 regions.
    • Within each region, the proxies are configured in HA (high availability) clusters, with active-active failover powered by Round Robin DNS (RRDNS).
  • Security, Traffic Control & Edge Filtering
    • Proxies are the natural choke point for WAF rules, rate limiting, DDoS mitigation, IP filtering, access control lists (ACLs), and blocking invalid requests.
    • Logging and metrics from this layer are crucial for observability (latency, traffic volumes, error rates).
    • Proxy logs (both access and error logs) are sent to centralized log systems (via Filebeat) for auditing, alerting, and troubleshooting.
    • Care is taken to preserve client IPs downstream (e.g. via the PROXY protocol or X-Forwarded-For headers) so that our backend services see the true origin IPs.
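
As a sketch of the client-IP preservation just described: when a backend sits behind trusted proxies that append to the X-Forwarded-For header, it can recover the true origin address by walking the chain from the right and skipping known proxy hops. The trusted-proxy list below is a hypothetical placeholder, not our real topology.

    # Minimal sketch: recover the original client IP behind trusted proxies.
    # TRUSTED_PROXIES is a hypothetical placeholder, not a real Files.com list.
    TRUSTED_PROXIES = {"10.0.0.1", "10.0.0.2"}

    def client_ip(x_forwarded_for: str, peer_ip: str) -> str:
        """Walk the X-Forwarded-For chain right-to-left, skipping trusted hops."""
        hops = [ip.strip() for ip in x_forwarded_for.split(",")] + [peer_ip]
        for ip in reversed(hops):
            if ip not in TRUSTED_PROXIES:
                return ip
        return peer_ip  # every hop was a trusted proxy; fall back to the peer

    print(client_ip("203.0.113.7, 10.0.0.1", "10.0.0.2"))  # -> 203.0.113.7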

Core API Layer (Central / Global)

This is the “brains” of the system: the set of microservices that expose business logic, metadata operations, orchestration, and the central control plane.

Stack & Technologies

  • The core APIs are largely implemented in Ruby — including our REST endpoints, authentication, orchestration, metadata, and control logic.
  • Underlying data technologies include:
    • MySQL / Amazon Aurora for relational metadata (users, permissions, configurations)
    • Redis for caching, fast ephemeral state, and in-memory operations
    • Memcached for caching
    • Elasticsearch for metadata search, full-text search, and log and audit indexing
    • Amazon S3 for primary file storage (object store)
    • Amazon Redshift for analytics and reporting
  • Autoscaling is built into the API server fleet to accommodate variable load.
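
To illustrate the Redis-over-Aurora pairing noted in the list above, here is a minimal cache-aside sketch in Python. The connection details, table, and key naming are assumptions for illustration, not our actual schema.

    import json

    import pymysql
    import redis

    # Connection details and schema are illustrative assumptions.
    cache = redis.Redis(host="localhost", port=6379)
    db = pymysql.connect(host="localhost", user="app", password="secret",
                         database="metadata")

    def get_user(user_id: int):
        """Cache-aside read: try Redis first, fall back to the relational store."""
        key = f"user:{user_id}"
        cached = cache.get(key)
        if cached is not None:
            return json.loads(cached)
        with db.cursor(pymysql.cursors.DictCursor) as cur:
            cur.execute("SELECT id, name FROM users WHERE id = %s", (user_id,))
            row = cur.fetchone()
        if row:
            cache.setex(key, 300, json.dumps(row))  # cache hot metadata for 5 minutes
        return row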

Multi-Tenancy & Isolation

  • Files.com is a multi-tenant system: customers share the same infrastructure physically, but are isolated logically (i.e. namespace separation).
  • Security boundaries exist such that one customer cannot see or impact another’s data or operations.

Inter-Service Communication

  • Microservices communicate via message queuing systems (including Redis queues and ZeroMQ) for asynchronous tasks and event-driven workflows (sketched below).
  • Synchronous RPC or HTTP calls are used for immediate, API-driven operations.
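
The asynchronous half of this pattern can be sketched as a simple Redis list used as a work queue: a producer pushes a job and a worker blocks until one arrives. The queue name and job fields are hypothetical.

    import json

    import redis

    r = redis.Redis(host="localhost", port=6379)
    QUEUE = "jobs:file_sync"  # hypothetical queue name

    def enqueue(job: dict) -> None:
        """Producer side: push a job for asynchronous processing."""
        r.lpush(QUEUE, json.dumps(job))

    def worker_loop() -> None:
        """Worker side: block until a job arrives, then process it."""
        while True:
            _queue, payload = r.brpop(QUEUE)
            job = json.loads(payload)
            print("processing", job["path"])

    enqueue({"path": "/inbound/report.csv", "action": "sync"})
    # worker_loop() would run in a separate worker process.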

Background Jobs & Worker Pools

  • The core layer schedules and triggers various background or automation tasks (e.g. event triggers, file syncs, transformation jobs).
  • Centralized workers handle control-plane tasks, but file content processing is always executed regionally in the Regional Services Layer (to respect data locality, latency, and cost).

Regional Services Layer (Edge / Protocol Services)

This is where protocol-specific functions live (SFTP, FTP, WebDAV, etc.), and where file data flows in/out. Because file content is bulky and latency-sensitive, this layer must reside near the client / data region.

Key Services

Each of the following is an independent service, with its own autoscaling pool of EC2 instances in each region:

  • FTP / FTPS
  • SFTP
  • WebDAV
  • Inbound S3 (i.e. accepting uploads via S3-compatible protocols)
  • Regional Integration Services (i.e. pushes or pulls of files to/from external endpoints)
  • Regional Job Processing (file movement, transformations, syncs within region)

These services handle data plane operations — e.g. actual file transfer, proxying, buffering, etc.

This layer also includes region-specific Redis clusters for metadata and authentication caching.

Isolation & Security

  • Services are isolated at both the machine level (dedicated instances per service) and network level (via AWS Security Groups) to restrict allowed traffic.
  • Each region’s services live within its own VPC, and are interconnected via VPC peering for internal control-plane coordination.
  • Within each service, that service operates as a multi-tenant system: customers share the same infrastructure physically, but are isolated logically by connection or RPC job.
  • All services connect to the same region-specific Redis cluster for metadata and authentication caching. One cluster serves the entire region, and we take great care to logically separate each customer's data into its own partition of the key space (see the sketch below).
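
A minimal sketch of that key-space separation: every cache access goes through a helper that prefixes keys with the customer (site) identifier, so one tenant's entries can never collide with another's. The prefix scheme shown is an assumption for illustration.

    import redis

    r = redis.Redis(host="localhost", port=6379)

    def tenant_key(site_id: int, suffix: str) -> str:
        """Namespace every key by customer so tenants never share entries."""
        return f"site:{site_id}:{suffix}"

    # Two customers caching "the same" entry land in disjoint key spaces.
    r.setex(tenant_key(101, "auth:alice"), 60, "token-a")
    r.setex(tenant_key(202, "auth:alice"), 60, "token-b")
    assert r.get(tenant_key(101, "auth:alice")) != r.get(tenant_key(202, "auth:alice"))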

Data Flow Dynamics

  • When a file operation arrives (say, client uploading over SFTP), it is handled by the regional service instance closest to the client (assigned via geo-DNS routing).
  • That service will interact with the central API layer for metadata, policy checks, and orchestration, but the heavy data flow is local to minimize cross-region data transfer.

Regional Integration Services

Regional Integration Services are the microservices that implement our integration with a given type of remote service (such as SharePoint, remote SFTP, etc.). Although we speak of these as one service, they are actually independent technical implementations (some in Ruby, some in Go).

Storage, Data & Persistence Layer

This is the foundation that persists files, metadata, audit logs, and all underlying state.

Multi-Tier Architecture

  • Primary Storage (Object Store):
    • Customer files (blobs, file content) are stored in AWS S3 (encrypted at rest).
    • We use one master bucket per region for customer data storage. Within that bucket, customer data is separated by key (folder) prefix. S3 signing is centralized in a manner that ensures only operations related to a single customer can sign an S3 request for that customer's data (illustrated in the sketch after this list).
  • Metadata / Indexing / Caching:
    • Amazon Aurora is used for tracking user accounts, configurations, file metadata, directory structure, permissions, and other relational data.
    • Redis is used for high-speed caching, ephemeral state, session data, locks, and queue coordination.
    • Elasticsearch is used for indexing metadata and supporting search, audit queries, and log indexing.
  • Audit & Logging:
    • Many user actions and system events generate logs and audit records. These are streamed or batched into Elasticsearch (for query) and S3 (for long-term retention and archive).
    • Each log stream has a different retention policy, chosen to optimize compliance with regulatory and customer requirements.
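
To illustrate the centralized, per-customer S3 signing described in the primary-storage bullet above, here is a hedged Python sketch: the signing helper refuses to sign any key outside the requesting customer's prefix before issuing a short-lived presigned URL. The bucket name and prefix layout are assumptions, not our actual layout.

    import boto3

    s3 = boto3.client("s3")
    BUCKET = "example-region-bucket"  # hypothetical per-region bucket name

    def sign_download(site_id: int, key: str, expires: int = 300) -> str:
        """Only sign requests inside the requesting customer's key prefix."""
        prefix = f"site-{site_id}/"  # hypothetical per-customer key prefix
        if not key.startswith(prefix):
            raise PermissionError("key is outside this customer's namespace")
        return s3.generate_presigned_url(
            "get_object",
            Params={"Bucket": BUCKET, "Key": key},
            ExpiresIn=expires,
        )

    url = sign_download(101, "site-101/reports/q3.pdf")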

External / Hybrid Storage Integration

  • Files.com can mount external storage endpoints (Azure Blob, Google Cloud Storage, S3-compatible endpoints, on-prem file servers, SharePoint, etc.) and present them in the unified namespace.
  • Access to these external storages is mediated through an Integration Service in the Regional Services Layer.
  • From the user’s perspective, external mounts behave like native directories — operations still go through Files.com for governance, permissions, audit, transformations, and movement.

Files.com Agent Layer

The Files.com Agent bridges the Files.com cloud platform with customer-controlled or private environments. It allows Files.com to reach securely into on-premises or private cloud storage without requiring inbound connections or complex networking configurations. This enables true hybrid file orchestration — one of Files.com’s key differentiators versus legacy MFT or EFSS systems.

Purpose & Overview

Many enterprise customers have data or systems that cannot leave their private environment, whether for compliance, data sovereignty, or performance reasons. The Files.com Agent allows Files.com to orchestrate files in place — performing transfers, synchronizations, automations, and audits on storage endpoints that are not directly exposed to the Internet.

In essence, the Agent Layer serves as a secure connector that:

  • Connects private storage (e.g., internal SFTP servers, NAS, SANs, SMB shares, or private S3 endpoints) to the Files.com orchestration engine.
  • Executes file operations locally under customer control.
  • Maintains end-to-end encryption and centralized audit trails.
  • Enables Files.com workflows (automations, triggers, syncs) to run against private data just as easily as cloud data.

Architecture & Design

The Files.com Agent runs as a lightweight daemon or service within the customer’s infrastructure — most commonly installed on:

  • A Linux or Windows server in a data center or private VPC.
  • A VM or container inside a private cloud (AWS, Azure, GCP, or private OpenStack).

Each Agent establishes a mutually authenticated, outbound encrypted connection to the Files.com cloud (Proxy layer). This ensures:

  • No inbound firewall changes required — only outbound HTTPS (port 443) traffic is needed.
  • Strong mutual identity verification — the Agent authenticates using an issued certificate tied to the customer’s account.
  • Encryption for all control and data-plane communications.
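
A hedged sketch of that outbound, mutually authenticated connection: the Agent presents its issued client certificate and verifies the server before any control traffic flows. The hostname and certificate paths are placeholders.

    import socket
    import ssl

    # Placeholder endpoint and certificate paths for illustration.
    CONTROL_PLANE = ("agent-gateway.example.com", 443)

    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # Present the Agent's issued client certificate for mutual authentication.
    context.load_cert_chain(certfile="agent.crt", keyfile="agent.key")

    with socket.create_connection(CONTROL_PLANE) as sock:
        with context.wrap_socket(sock, server_hostname=CONTROL_PLANE[0]) as tls:
            print("negotiated", tls.version(), "with", tls.getpeercert()["subject"])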

Internal Components

  • Agent Core — handles communication with Files.com control plane, job dispatch, and health monitoring.
  • RPC Job Executors — perform local file operations (copy, move, upload, download, sync).
  • Credential Vault — securely stores connection credentials for local resources.
  • Updater — ensures the Agent stays patched automatically and aligned with current Files.com versions.

The Agents are stateless from the Files.com perspective — multiple Agents can be deployed in parallel for scale or redundancy, all registered under the same Files.com environment.

Security and Compliance Characteristics

  • Outbound-only Connectivity: The Agent never exposes inbound ports, minimizing attack surface.
  • Per-Agent Identity: Each Agent has a unique identity and certificate; compromised Agents can be revoked immediately without affecting others.
  • Zero Data Persistence: The Agent is not a data store; for the most part it streams data directly during transfer, keeping no permanent local cache unless explicitly configured. Uploads are temporarily buffered to disk in the temporary directory of the destination disk drive, and the Agent sweeps these temporary buffers regularly (a sweep sketch follows this list).
  • Encryption Everywhere: All communications are encrypted; file data may also be encrypted at rest on both endpoints independently.
  • Customer Governance: The Agent runs in the customer’s administrative domain, allowing them to control OS patching, network configuration, and deployment topology.
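
As a sketch of the temporary-buffer sweep mentioned in the zero-data-persistence bullet, the loop below removes stale upload buffers from a temp directory. The buffer-file prefix and age threshold are assumptions, not the Agent's actual implementation.

    import os
    import tempfile
    import time

    MAX_AGE_SECONDS = 3600  # hypothetical sweep threshold

    def sweep_temp_buffers(directory: str = tempfile.gettempdir()) -> None:
        """Remove stale upload buffers left behind by interrupted transfers."""
        cutoff = time.time() - MAX_AGE_SECONDS
        for name in os.listdir(directory):
            if not name.startswith("filesagent-"):  # hypothetical buffer prefix
                continue
            path = os.path.join(directory, name)
            try:
                if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
                    os.remove(path)
            except OSError:
                pass  # a transfer may have just finished; skip and retry next sweep

    sweep_temp_buffers()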

Deployment & Operations

  • Agents are distributed via signed installers (Linux packages, Windows service executables).
  • Setup is simple: register the Agent via the Files.com Admin UI, download a tokenized configuration file, and launch the Agent.
  • Once registered, the Agent maintains a persistent outbound session to the Files.com control plane, reporting health and receiving work items.
  • Load balancing among multiple Agents can be automatic: the Files.com orchestration engine distributes jobs across available Agents based on capacity and locality.

Typical enterprise deployments run two or more Agents per site for redundancy, often behind corporate firewalls or in segregated network segments.

Operational Visibility

  • Each Agent’s status (online/offline, version, last check-in) is visible in the Files.com Admin UI.
  • Alerts can be configured for Agent health events (disconnection, job failures, version drift).
  • Agents automatically receive updates and new job definitions from the Core API.

Performance & Scalability

  • Agents can process multiple jobs concurrently, limited only by local system resources and configured concurrency thresholds.
  • When multiple Agents are deployed, Files.com’s orchestration engine distributes workloads intelligently to prevent overloading any single Agent.
  • Network throughput is typically bounded by the customer’s outbound Internet capacity; Files.com optimizes data streaming with compression and multipart transfer techniques.
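
A minimal sketch of the workload distribution described above: choose the Agent with the most free concurrency slots and refuse dispatch when the fleet is saturated. The data model and selection policy are simplifying assumptions, not the actual orchestration algorithm.

    from dataclasses import dataclass

    @dataclass
    class Agent:
        name: str
        max_concurrency: int  # configured concurrency threshold
        active_jobs: int      # jobs currently executing

    def pick_agent(agents: list[Agent]) -> Agent:
        """Pick the Agent with the most spare capacity (hypothetical policy)."""
        candidates = [a for a in agents if a.active_jobs < a.max_concurrency]
        if not candidates:
            raise RuntimeError("all Agents are at capacity")
        return max(candidates, key=lambda a: a.max_concurrency - a.active_jobs)

    fleet = [Agent("dc1-a", 8, 7), Agent("dc1-b", 8, 2)]
    print(pick_agent(fleet).name)  # -> dc1-b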

Infrastructure & Cross-Cutting Concerns

Networking & Inter-Region Connectivity

  • Regions are networked together (via VPC peering) for control-plane communication, routing, and failover.
  • Data locality is respected: file content is generally not moved across regions unless requested by a user.
  • Proxy / edge routing via geo-DNS ensures clients talk to the nearest region for latency and performance. Customers can mandate use of a single region through a sitewide setting.

Redundancy, Failover & High Availability

  • Every component is deployed in an HA configuration (multiple instances across multiple availability zones). We always use at least 4 availability zones in the USA region and 2 availability zones in all other regions.
  • The system supports both cross-AZ and cross-region failover: if a region fails, traffic and operations can be routed to another region, unless a site's settings require operations to remain in a single region.
  • Most services are configured to deploy changes using safe patterns (blue/green, canary, and rolling updates) to avoid customer-impacting downtime.

Security, Encryption & Key Management

  • In Transit: All client-to-proxy communication is encrypted via TLS (modern ciphers) or SSH (for SFTP) unless unencrypted FTP is manually enabled on a site.
  • At Rest: Files stored in S3 and other storage layers are encrypted (AES-256).
  • Key Management: Files.com handles keys internally (with appropriate rotation, isolation, and audit controls). Customers have options for adding additional encryption via GPG.
  • Identity & Access: Files.com offers multiple types of authentication (SSO, API keys, MFA), role-based access control, per-user and per-group permissions.
  • Time Synchronization & Clocking: All systems are synchronized via NTP to maintain consistency and reliable timestamps across logs.

Observability, Monitoring & Telemetry

  • Metrics exist at every layer: latency, throughput, error rates, system health.
  • We have centralized logging, tracing, and alerting (e.g. via Elastic, InfluxDB, CloudWatch, Sensu, and Grafana).
  • We maintain sophisticated dashboards for our internal ops and support teams.
  • Customers have direct access to their logs both through the Files.com web app and the API. These queries are streamed directly from our master Elasticsearch clusters (a query sketch follows this list).
  • We have highly sophisticated automated alerts and remediation for degraded behavior.
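
As an illustration of the kind of query this log pipeline supports, here is a sketch of fetching a site's recent audit events from Elasticsearch. The cluster address, index name, and field names are assumptions, not our internal schema.

    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")  # placeholder cluster address

    # Index and field names are illustrative assumptions.
    result = es.search(
        index="audit-logs",
        query={"bool": {"filter": [
            {"term": {"site_id": 101}},
            {"range": {"@timestamp": {"gte": "now-1h"}}},
        ]}},
        size=20,
        sort=[{"@timestamp": "desc"}],
    )
    for hit in result["hits"]["hits"]:
        print(hit["_source"].get("action"), hit["_source"].get("path"))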

Compliance & Governance

  • The architecture and hosting environment are reviewed in annual audits (e.g. SOC 2 Type II) and documented in compliance docs.
  • Controls exist for data retention, deletion, export, and data residency constraints (e.g. retaining data only in certain jurisdictions).
