Cluster with file storage Deployment
This deployment consists of 4 instances:
- 1 Commander instance with file storage;
- 2 Master instances;
- 1 Refresh instance.
For additional details, refer to the Architecture section.
WARNING
Use a separate instance for running Scrapoxy at scale. Don't run VPNs or complicated network setups on the same machine, as this can cause connection problems, especially when scaling up.
Docker Compose
Create a haproxy.cfg file with the following content:
resolvers default
    parse-resolv-conf

frontend master
    mode tcp
    bind :8888
    default_backend all

backend all
    mode tcp
    server-template master 2 master:8888 check init-addr last,none resolvers default

Adjust the count in the server-template directive (2 in this example) to match the number of Master instances required by your sizing.
WARNING
Ensure there is an empty line at the end of the haproxy.cfg file.
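The check above can be automated. The following is a minimal sketch, assuming the warning refers to the last line of haproxy.cfg needing to end with a newline character; the function name ensure_trailing_newline is hypothetical and not part of Scrapoxy or HAProxy.

```shell
# Append a final newline to a file only if its last byte is not already one.
ensure_trailing_newline() {
  file=$1
  # tail -c 1 prints the last byte. Command substitution strips a trailing
  # newline, so a non-empty result means the file does not end with one.
  if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
    printf '\n' >> "$file"
  fi
}

# Does nothing if haproxy.cfg is missing, empty, or already ends with a newline.
ensure_trailing_newline haproxy.cfg
```

Running it twice is harmless, since the newline is only appended when it is missing.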
In the same directory, create a docker-compose.yml file with the following content:
services:
  commander:
    image: scrapoxy/scrapoxy
    command: "node scrapoxy.js start -f -c --storage file"
    ports:
      - 8890:8890
    environment:
      - NODE_ENV=production
      - AUTH_LOCAL_USERNAME=admin
      - AUTH_LOCAL_PASSWORD=password
      - BACKEND_JWT_SECRET=secret1
      - FRONTEND_JWT_SECRET=secret2
      - STORAGE_FILE_FILENAME=/etc/scrapoxy/config.json
    volumes:
      - ./scrapoxy:/etc/scrapoxy

  haproxy:
    image: haproxy
    ports:
      - 8888:8888
    volumes:
      - ./haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro

  master:
    image: scrapoxy/scrapoxy
    command: "node scrapoxy.js start -m"
    deploy:
      mode: replicated
      replicas: 2
      endpoint_mode: dnsrr
    environment:
      - NODE_ENV=production
      - COMMANDER_URL=http://commander:8890/api
      - BACKEND_JWT_SECRET=secret1
    links:
      - commander

  refresh:
    image: scrapoxy/scrapoxy
    command: "node scrapoxy.js start -r"
    environment:
      - NODE_ENV=production
      - COMMANDER_URL=http://commander:8890/api
      - BACKEND_JWT_SECRET=secret1
    links:
      - commander

Make sure to update the replicas field in the master service to match the number of Master instances required.
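
The Master count appears in two places: the server-template directive in haproxy.cfg and the replicas field in docker-compose.yml. A quick consistency check before starting the stack can catch a mismatch. This is a minimal sketch, not part of Scrapoxy: the function names are hypothetical, and it assumes the master service is the only one that declares replicas.

```shell
# Read the slot count N from the first "server-template <prefix> N ..." line.
template_slots() {
  awk '$1 == "server-template" { print $3; exit }' "$1"
}

# Read the value of the first "replicas:" key in a Compose file.
# Assumption: only the master service declares replicas.
compose_replicas() {
  awk '$1 == "replicas:" { print $2; exit }' "$1"
}

# Succeed only when both files are readable and declare the same Master count.
# Usage: check_master_count haproxy.cfg docker-compose.yml
check_master_count() {
  slots=$(template_slots "$1") || return 1
  replicas=$(compose_replicas "$2") || return 1
  if [ -n "$slots" ] && [ "$slots" = "$replicas" ]; then
    echo "OK: $slots Master instances in both files"
  else
    echo "Mismatch: server-template=$slots replicas=$replicas" >&2
    return 1
  fi
}
```

Calling check_master_count haproxy.cfg docker-compose.yml from the directory that holds both files returns a non-zero status when the two values differ.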
Run the following command:
docker compose up -d
Kubernetes with Helm
Create a Helm umbrella chart with the following file structure:
myscrapoxy/
|- Chart.yaml
|- values.yaml
Edit Chart.yaml with the following content:
apiVersion: v2
name: scrapoxy-simple-cluster
description: Scrapoxy Simple Cluster
type: application
version: 1.0.0
appVersion: latest
dependencies:
  - name: scrapoxy-simple-cluster
    version: 1.0.0
    repository: https://charts.scrapoxy.io

Edit values.yaml with the following content:
scrapoxy-simple-cluster:
  auth:
    local:
      username: "admin"
      password: "password"

  backendJwtSecret: "secret1"
  frontendJwtSecret: "secret2"

Build dependencies:
helm dependency build ./myscrapoxy
Install the chart:
helm install scrapoxy ./myscrapoxy
Sizing
Master instances handle all requests, so they are the component to scale the most.
Here's a reference table:
| Max concurrent requests | Number of Masters | Number of vCPUs |
|---|---|---|
| 100 | 1 | 2 |
| 200 | 2 | 3 |
| 500 | 3 | 4 |
| 1000 | 4 | 5 |
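
The table can be turned into a small lookup when scripting a deployment. The sketch below is only an illustration: the function name masters_for is hypothetical, and it assumes each row is an upper bound, so a load that falls between two rows is rounded up to the next row. The table gives no guidance beyond 1000 concurrent requests, so the function reports an error there rather than extrapolating.

```shell
# Print the Master count from the reference table for a target number of
# concurrent requests. Assumption: each row is read as an upper bound.
masters_for() {
  requests=$1
  if [ "$requests" -le 100 ]; then
    echo 1
  elif [ "$requests" -le 200 ]; then
    echo 2
  elif [ "$requests" -le 500 ]; then
    echo 3
  elif [ "$requests" -le 1000 ]; then
    echo 4
  else
    echo "no reference value above 1000 concurrent requests" >&2
    return 1
  fi
}

# 350 falls between the 200 and 500 rows, so this prints 3.
masters_for 350
```

The result is the value to use for both the server-template count in haproxy.cfg and the replicas field of the master service.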