Environment variables
Scrapoxy can be customized with environment variables to align with specific infrastructure or setup requirements.
Security
Commander
BACKEND_JWT_SECRET: Secret of the JWT used for internal connections.BACKEND_JWT_EXPIRATION: Duration of the JWT for internal connections. The default value is60s.
User interface
FRONTEND_JWT_SECRET: Secret of the JWT used for users connections.FRONTEND_JWT_EXPIRATION: Duration of the JWT for users connections. The default value is24h.FRONTEND_SECURE_COOKIE: Enable the secure flag on the authentication cookie by setting it to1, especially if the Scrapoxy UI is located behind an SSL reverse proxy like Nginx. The default value is0.
WARNING
To start Scrapoxy, it is mandatory to set the secrets BACKEND_JWT_SECRET and FRONTEND_JWT_SECRET.
Network
Commander
COMMANDER_PORT: Port of the Commander API. The default value is8890.COMMANDER_URL: Commander exposed API URL. It is required if using a distributed configuration. The default value ishttp://localhost:COMMANDER_PORT/api.
User interface
FRONTEND_URL: Default URL of the frontend, used for authentication. The default value ishttp://localhost:8890.
Master
MASTER_PORT: Port of the Master. The default value is8888.MASTER_TIMEOUT: Timeout in milliseconds when the Master relays a request to a proxy endpoint. The default value is60000(1 minute).MASTER_CERTIFICATE_CERT: Certificate to activate TLS support on the MasterMASTER_CERTIFICATE_KEY: Secret key of the certificate. Both values must be set to activate TLS.
Probe
PROBE_PORT: Port of the probe that checks if storage is alive. The default value is8887.
Fingerprint
FINGERPRINT_URL: URL of the fingerprint server to retrieve proxy information. The default value ishttps://fingerprint.scrapoxy.io/api/json.FINGERPRINT_FOLLOW_REDIRECT_MAX: Max number of HTTP redirects allowed when requesting the fingerprint server. The default value is3.FINGERPRINT_RETRY_MAX: Maximum retries before stating a proxy or freeproxy are inaccessible. The default value is2.FINGERPRINT_TIMEOUT: Timeout in milliseconds of request when requesting the fingerprint server. The default value is5000(5 seconds).
Man-in-the-middle
MITM_CERTIFICATE_DURATION: Duration in milliseconds of generated website TLS certificates. The default value is31536000000(1 year).
Authentication
Username/password
To activate basic authentication, set all the following environment variables:
AUTH_LOCAL_USERNAME: UsernameAUTH_LOCAL_PASSWORD: Password
Github
To activate Github OAuth authentication, set all the following environment variables:
AUTH_GITHUB_CLIENT_ID: Github Client IDAUTH_GITHUB_SECRET: Github SecretAUTH_GITHUB_CALLBACK_URL: Callback URL. The default value isFRONTEND_URL/api/users/auths/github.
Google
To activate Google OAuth authentication, set all the following environment variables:
AUTH_GOOGLE_CLIENT_ID: Google Client IDAUTH_GOOGLE_SECRET: Google SecretAUTH_GOOGLE_CALLBACK_URL: Callback URL. The default value isFRONTEND_URL/api/users/auths/google.
Storage
File
If you are using file storage, configure the following environment variables:
STORAGE_FILE_FILENAME: Filename of the local configuration file. The default value isscrapoxy.json.STORAGE_FILE_CERTIFICATES_MAX: Maximum number of TLS certificates cached in memory for file storage. The default value is1000.
Distributed
If you are using distributed storage, configure the following environment variables:
For MongoDB:
STORAGE_DISTRIBUTED_MONGO_URI: URI of MongoDB server. The default value ismongodb://user:password@localhost.STORAGE_DISTRIBUTED_MONGO_DB: Name of MongoDB database. The default value isscrapoxy.STORAGE_DISTRIBUTED_MONGO_CERTIFICATES_SZ: Maximum size in bytes of TLS certificates cached in MongoDB. The default value is268435456(256 MB).
For RabbitMQ:
STORAGE_DISTRIBUTED_RABBITMQ_URL: URL of RabbitMQ server. The default value isamqp://user:password@localhost:5672.STORAGE_DISTRIBUTED_RABBITMQ_QUEUE_ORDERS: Queue of RabbitMQ to send CQRS orders. The default value isscrapoxyorders.STORAGE_DISTRIBUTED_RABBITMQ_QUEUE_EVENTS: Queue of RabbitMQ to receive CQRS events. The default value isscrapoxyevents.
Refresh
Connectors
CONNECTORS_REFRESH_EMPTY_DELAY: Delay in milliseconds to wait if there is no connector to refresh. The default value is1000(1 second).CONNECTORS_REFRESH_ERROR_DELAY: Delay in milliseconds to wait if connector's refresh triggers an error. The default value is2000(2 seconds).
Proxies
PROXY_REFRESH_COUNT: Number of proxies to fingerprint at once. The default value is200.PROXY_REFRESH_DELAY: Delay in milliseconds between 2 fingerprint requests of a proxy, adjusted by subtracting the timeout duration. The default value is1000(1 seconds).PROXIES_REFRESH_EMPTY_DELAY: Delay in milliseconds to wait if there is no proxy to refresh. The default value is1000(1 second).PROXIES_REFRESH_ERROR_DELAY: Delay in milliseconds to wait if proxy's refresh triggers an error. The default value is2000(2 seconds).
Freeproxies
FREEPROXY_REFRESH_COUNT: Number of freeproxies to fingerprint at once. The default value is100.FREEPROXY_REFRESH_DELAY: Delay in milliseconds between 2 fingerprint requests of a freeproxy, adjusted by subtracting the timeout duration. The default value is60000(1 minute).FREEPROXIES_REFRESH_EMPTY_DELAY: Delay in milliseconds to wait if there is no freeproxy to refresh. The default value is1000(1 second).FREEPROXIES_REFRESH_ERROR_DELAY: Delay in milliseconds to wait if freeproxy's refresh triggers an error. The default value is2000(2 seconds).
Metrics
MASTER_REFRESH_METRICS_DELAY: Delay interval in milliseconds to send traffic metrics of the Master to the Commander API. The default value is10000(10 seconds).METRICS_REFRESH_REFRESH_DELAY: Delay interval in milliseconds to calculate metrics of all projects. The default value is10000(10 seconds).
Tasks
TASKS_REFRESH_EMPTY_DELAY: Delay in milliseconds to wait if there is no task to execute. The default value is1000(1 second).TASKS_REFRESH_ERROR_DELAY: Delay in milliseconds to wait if task's execution triggers an error. The default value is2000(2 seconds).
Stopping Scrapoxy
CLEAR_AT_SHUTDOWN: Clear all proxies at shutdown. Values are1for yes and0for no. This is useful when you run Scrapoxy locally. The default value is0.STOPPING_DELAY: Delay in milliseconds between queries count of active proxies during shutdown. It is only available ifCLEAR_AT_SHUTDOWNis active. The default value is2000(2 seconds).