Commit Graph

231 Commits

Author SHA1 Message Date
Jon Chambers 7bf8650d59 Un-manage FaultTolerantRedisCluster so it shuts down at JVM shutdown instead of Jetty shutdown. 2020-07-30 18:37:38 -04:00
Ehren Kret 7cb24dd96d Add environment tag to datadog metric reporting 2020-07-30 18:04:16 -04:00
Ehren Kret dee040318a Add the host tag to datadog metric reporting 2020-07-30 18:04:16 -04:00
Jon Chambers baf563e46d Temporarily disarm the actual persisting part of the message persister. 2020-07-30 17:12:37 -04:00
Jon Chambers e10246f10b Use Dropwizard timers/histograms for persister metrics. 2020-07-30 14:27:06 -04:00
Jon Chambers a9dfd88671 Start the clustered message persister at application startup. 2020-07-30 12:32:35 -04:00
Jon Chambers beac73b6c8 Add a cluster-capable message persister 2020-07-30 11:39:14 -04:00
Jon Chambers f9f93c77e2 Use UUIDs instead of phone numbers as account identifiers in clustered message cache 2020-07-30 11:39:14 -04:00
Jon Chambers 6fc1b4c6c0 Add a cluster-backed message cache. 2020-07-30 11:39:14 -04:00
Jon Chambers 639898ec07 Expand Experiment to deal with async suppliers and Optionals. 2020-07-30 11:39:14 -04:00
Jon Chambers 3d3790fdbc Add binary execution methods to ClusterLuaScript. 2020-07-30 11:39:14 -04:00
Jon Chambers 69c8968cb0 Add byte-array-based methods to FaultTolerantRedisCluster. 2020-07-30 11:39:14 -04:00
Jon Chambers aa25fc7901 Fix UsernamesManager metric/logger names. 2020-07-29 11:00:29 -04:00
Jon Chambers 4aba493ee2 Fix the key used for database crawler workers. 2020-07-29 10:58:06 -04:00
Jon Chambers b9cfac5934 Introduce additional metric aggregators. 2020-07-28 15:11:51 -04:00
Brian Acton f8e97fcc32 revise 12 hour active user fudge to 8 hours for better continuity of data from a month ago 2020-07-28 11:09:41 -07:00
Jon Chambers 7f8f2641f6 Simplify registration lock counting by avoiding inactive accounts. 2020-07-28 11:48:20 -04:00
Jon Chambers 022dbb606f Count registration lock versions when crawling the account database. 2020-07-28 11:48:20 -04:00
Jon Chambers fea72b190d Record message content size as a dimensioned distribution. 2020-07-28 11:47:56 -04:00
Jon Chambers eea073f882 Decommission the old cache. 2020-07-28 10:29:28 -04:00
Jon Chambers fc1d88f5bb Read exclusively from the cache cluster. 2020-07-27 15:11:40 -04:00
Jon Chambers acbe410e0b Remove a metric aggregator. 2020-07-27 12:50:49 -04:00
Ehren Kret 89bafea61f Move SMS strings to configuration 2020-07-27 11:23:21 -05:00
Jon Chambers 33a0c4a9ae Use first party metric aggregator libraries where possible. 2020-07-24 17:21:56 -04:00
Jon Chambers 4cc5999f05 Configure additional metric aggregators. 2020-07-23 13:31:19 -04:00
Jon Chambers 0fbf31ec98 Clear each cluster node individually. 2020-07-22 11:12:21 -04:00
Jon Chambers db9b7ca447 Fix slot assignment when building a cluster for tests. 2020-07-22 11:04:10 -04:00
Jon Chambers eecc71c77f Revert batch message storage. (#95) 2020-07-20 16:28:32 -04:00
Jon Chambers 5f898a9071 Measure inserted message batch size. 2020-07-20 10:30:29 -04:00
Jon Chambers a08f21336a Be explicit about transaction management. 2020-07-20 10:30:29 -04:00
Jon Chambers 215125de26 Update tests. 2020-07-20 10:30:29 -04:00
Jon Chambers dfa94eac41 Store messages in batches. 2020-07-20 10:30:29 -04:00
Jon Chambers 247d869a5c De-randomize message tests to minimize flakiness. 2020-07-14 18:46:39 -04:00
Ehren Kret b9b6e1818f Rename SenderIdSelector to SenderIdSupplier per code review discussion 2020-07-14 10:53:48 -05:00
Ehren Kret a7968ccc3c Address code review comments 2020-07-14 10:53:48 -05:00
Ehren Kret b7e0e5a356 Create a strategy class to decide which sender id to use
The rules around selecting sender ids can get complicated with some
countries not supporting it and others requiring pre-registration that
may result in having a different sender id for that country than
others. This strategy class handles the logic of dealing with this
expanded configuration and applying the appropriate sender id or none
when it's not appropriate to do so at all.
2020-07-14 10:53:48 -05:00
Brian Acton e3aecb2aa9 apply a 12 hour fudge to daily user counting to account for last seen timestamp fuzzing 2020-07-09 17:43:12 -07:00
Jon Chambers 116ab83b95 Include a PushType header when sending APNs notifications. 2020-07-09 16:12:20 -04:00
Jon Chambers c5d0d4acd0 Revert "Move rate limiter logic to Lua scripts"
This reverts commit b585c6676d.
2020-07-09 12:30:25 -04:00
Jon Chambers 06190286ec Remove temporary circuit breaker suppression. 2020-07-07 16:33:05 -04:00
Jon Chambers 3bca856e87 Remove a pair of spurious SET calls in the rate limiter script. 2020-07-07 16:33:05 -04:00
Jon Chambers b3a778b89a Temporarily catch and log all script execution exceptions to avoid opening the breaker. 2020-07-07 15:17:25 -04:00
Jon Chambers dcb11f7606 Log errors from experiments. 2020-07-07 15:17:25 -04:00
Jon Chambers 933ce42d5a Test rate limiters against a real cluster. 2020-07-07 15:17:25 -04:00
Ehren Kret 6c1ba957bd Ensure the default alphaId configuration is an empty list rather than null 2020-07-07 10:17:40 -05:00
Ehren Kret e021286eee Add configuration by country for sending from alpha IDs 2020-07-07 10:17:40 -05:00
Ehren Kret 0ee7a66033 Keep trying ports until you get one lower than 55535 (#83)
* Keep trying ports until you get one lower than 55535

* Rename method and change to do...while

* Limit attempts to 11,000 to find an open redis cluster port
2020-07-07 10:12:31 -05:00
Jon Chambers 42c797ee97 Set the default log level for tests to WARN. 2020-07-07 11:05:39 -04:00
Jon Chambers b585c6676d Move rate limiter logic to Lua scripts 2020-07-06 10:10:13 -04:00
Jon Chambers f5ddb0f1f8 Test ClusterLuaScript against a real Redis cluster. 2020-07-02 18:58:30 -04:00
Jon Chambers ef97f9e738 Revert "Temporarily suspend execution of the "unlock" script."
This reverts commit 6aecd8d44a.
2020-07-02 18:58:30 -04:00
Jon Chambers 26a03b55de Un-reinvent the clustered script execution wheel. 2020-07-02 18:58:30 -04:00
Jon Chambers b93a16abae Honor the step size set in the micrometer config. 2020-07-02 11:40:41 -04:00
Jon Chambers ff2783d434 Fixed a goof where we were mirroring a write to the wrong key in the new cache cluster. 2020-07-02 11:40:27 -04:00
Ehren Kret 25a5a8db68 Set avatar to null on Account when request is false (#78) 2020-06-29 15:53:31 -05:00
Jon Chambers a68d91b54c Resolve some test flakiness by adding a deterministic "wait" mechanism. (SERVER-86) 2020-06-29 12:24:25 -04:00
Jon Chambers 88ec3a5751 Add a counter for dead letter events. 2020-06-26 09:00:11 -04:00
Jon Chambers 734dc2e37a Don't block the Redis instance when clearing the cache. 2020-06-19 10:52:18 -04:00
Jon Chambers 6aecd8d44a Temporarily suspend execution of the "unlock" script. 2020-06-17 22:27:02 -04:00
Jon Chambers bbf5e1fa78 Use the UA string from websocket upgrade requests if available. 2020-06-17 15:40:18 -04:00
Jon Chambers 7454e55693 Write synchronously to the cache cluster. 2020-06-17 15:38:56 -04:00
Jon Chambers c745fe7778 Fix a poorly-mirrored cache delete operation. 2020-06-17 15:35:46 -04:00
Jon Chambers 6adcebb247 Return to just using counters instead of timers for measuring experiment outcomes. 2020-06-17 15:34:02 -04:00
Jon Chambers 38f9b8f3dd Make write operations in `AccountDatabaseCrawlerCache` synchronous. 2020-06-17 10:05:43 -04:00
Jon Chambers 7faf143a97 Subdivide the account database crawler cache experiment and add logging to track down lingering disagreements. 2020-06-17 09:23:40 -04:00
Jon Chambers 17cfd4924c Fixed a poorly-mirrored write operation to the new cluster. 2020-06-16 16:46:41 -04:00
Jon Chambers a0bebca1e6 Extend Experiment to report more detail when results don't match. 2020-06-16 16:46:41 -04:00
Jon Chambers 75cbfa2898 Mirror unlock-via-script calls to the cache cluster. 2020-06-16 16:46:41 -04:00
Jon Chambers 58a8ed1588 Add a cluster-friendly version of LuaScript. 2020-06-16 16:46:41 -04:00
Jon Chambers e032f8df59 Add a command for clearing the cache cluster. 2020-06-16 16:46:41 -04:00
Jon Chambers b16e37d80a Record a histogram of incoming message list sizes. 2020-06-12 14:43:50 -04:00
Jon Chambers c17cc07b73 Instrument BlockingThreadPoolExecutor. 2020-06-12 14:43:50 -04:00
Jon Chambers 6f767a72a7 Add a timer for the private sendMessage method. 2020-06-12 14:43:50 -04:00
Jon Chambers 11196436e9 Time rate limiter validation calls. 2020-06-12 14:43:50 -04:00
Jon Chambers 9afc433db4 Record exceptions associated with server responses. 2020-06-11 22:08:07 -04:00
Jon Chambers f701e3d834 Record distributions of timer values; stop recording error causes. 2020-06-11 11:50:36 -04:00
Jon Chambers 4c623ca3c5 Compare Redis reads using Lettuce's synchronous path. 2020-06-11 11:50:36 -04:00
Jon Chambers 0671f05c05 Introduce experiment comparison methods for suppliers. 2020-06-11 11:50:36 -04:00
Jon Chambers 0713da7393 Record experiment results with a timer instead of a counter. 2020-06-11 11:50:36 -04:00
Jon Chambers 05955d0483 Check for null header values before trying to iterate through them. 2020-06-09 15:45:32 -04:00
Jon Chambers 28c765bd9a Add an in-app-context test for websocket metrics. 2020-06-09 15:45:32 -04:00
Ehren Kret 8287317be7 Add account device ID to the prekey rate limiter
This limits prekey fetching per device on an account instead of on an
account level.
2020-06-09 10:20:10 -07:00
Jon Chambers ec858b2d4c Set a timeout for Redis cluster operations and shut down the cluster as part of service shutdown 2020-06-07 18:27:57 -04:00
Jon Chambers 47ece983d2 Added a Redis cluster health check. 2020-06-07 18:27:11 -04:00
Jon Chambers 52310b5dd9 Compare results of reads from old and new Redis caches. 2020-06-07 18:27:11 -04:00
Jon Chambers c2a4a2778e Introduce the Experiment class to compare results from parallel systems. 2020-06-07 18:27:11 -04:00
Jon Chambers 1db5977e80 Mirror username deletes unconditionally. 2020-06-07 18:27:11 -04:00
Jon Chambers 1b5dc0e434 Fixed a potential issue where locks could get out of sync between Redis instances. 2020-06-07 18:27:11 -04:00
Moxie Marlinspike f07f02d866 Deliver upgrade link to stale clients 2020-06-06 18:20:55 -07:00
Jon Chambers 1388103919 Mirror writes to the cache cluster. 2020-06-06 20:37:48 -04:00
Jon Chambers fe1054d58a Introduce a Lettuce-based fault-tolerant Redis cluster accessor. 2020-06-06 20:37:48 -04:00
Jon Chambers ba6ac778fc Update to Pushy v0.14.1. 2020-06-05 12:21:56 -04:00
Jon Chambers 228ffcbfce Differentiate between websocket and "boring" HTTP traffic. 2020-05-28 12:52:49 -04:00
Jon Chambers f18ab9e5cc Measure traffic from websockets. 2020-05-28 12:52:49 -04:00
Jon Chambers 06c82ee87d Celebrate the diversity of UA strings when generating tags for metrics. 2020-05-27 19:35:42 -04:00
Jon Chambers 9ba5ee8043 Move UA tag extraction into its own utility class. 2020-05-27 19:35:42 -04:00
Ehren Kret eede4e50ca Use hashed UUID to spread last seen updates over a full day (#40) 2020-05-26 13:38:52 -07:00
Jon Chambers aa10f63d9f Add the timestamp using the `add` method. 2020-05-22 17:39:25 -04:00
Jon Chambers a25af36e32 Include timestamps in all server-to-client websocket messages. 2020-05-22 15:13:39 -04:00
Jon Chambers 817f057927 Inject timestamps into responses. 2020-05-22 15:13:39 -04:00