Commit Graph

286 Commits

Author SHA1 Message Date
Jon Chambers a709a3bcc0 Remove a candidate metric provider. 2020-08-20 15:40:56 -04:00
Jon Chambers 34bf5112e0 Drop TimeProvider. 2020-08-20 15:40:24 -04:00
Jon Chambers bfe18d1d28 Re-nerf the clustered message persister. 2020-08-20 15:38:09 -04:00
Jon Chambers 6a76afc20d Add a test to make sure the persister is respecting persist delays. 2020-08-20 15:38:09 -04:00
Jon Chambers 9c469c2f96 Base persister tests on a real Redis cluster. 2020-08-20 15:38:09 -04:00
Jon Chambers 2ab42f3dd6 Refine and expand clustered message cache metrics. 2020-08-19 11:39:05 -04:00
Jon Chambers af34b43a8d Reactivate the message notification experiment. 2020-08-19 11:39:05 -04:00
Jon Chambers 0f71cc7864 Rename metrics associated with cluster circuit breakers for clarity. 2020-08-18 17:59:00 -04:00
Jon Chambers df90de3a5f Change default Lettuce command timeout to 10s. 2020-08-18 16:21:42 -04:00
Jon Chambers 42ea7a9814 Revert Lettuce connection pooling. 2020-08-18 16:21:42 -04:00
Jon Chambers c683cbdb2d Time Redis operations. 2020-08-18 12:20:12 -04:00
Jon Chambers d243b73678 Make Lettuce connection pools configurable. Double the default size. 2020-08-18 12:20:12 -04:00
Jon Chambers dc28d063aa Reactivate the explicit client presence experiment. 2020-08-17 11:34:27 -04:00
Jon Chambers bb6045c1d0 Disarm the client presence manager experiment. 2020-08-15 20:23:05 -04:00
Jon Chambers f1a74b5939 Disarm new message keyspace notifications. 2020-08-15 20:23:05 -04:00
Jon Chambers 6fb9038af1 Move to a synchronous, pooled connection model for Redis clusters. 2020-08-14 17:15:56 -04:00
Jon Chambers 27f721a1f5 Update to resilience4j 1.5.0. 2020-08-14 17:15:56 -04:00
Jon Chambers 5717dc294e Combine the read/write breakers for Redis clusters. 2020-08-14 17:15:56 -04:00
Jon Chambers ae0f8df11b Break out FaultTolerantPubSubConnection as its own thing so different use cases can have their own subscription space. 2020-08-14 17:15:56 -04:00
Jon Chambers 77460ba502 Remove keyspace notification configuration checks because AWS doesn't support `CONFIG GET`. 2020-08-13 15:32:25 -04:00
Jon Chambers f8235da4d8 Fix an issue where the queue for a thread pool was not bounded. 2020-08-13 12:46:11 -04:00
Jon Chambers 8d3316ccd6 Listen for new messages via keyspace notifications. 2020-08-13 12:17:04 -04:00
Jon Chambers 2c29f831e8 Add an explicit client presence system. 2020-08-13 10:56:26 -04:00
Jon Chambers 9457325119 Add pub/sub affordances to FaultTolerantRedisCluster. 2020-08-13 10:56:26 -04:00
Jon Chambers 189f8afcc9 Warm up the test cluster before running tests to avoid transient startup jitters. 2020-08-13 10:56:26 -04:00
Jon Chambers 9699b67510 Record the size of outgoing message lists. 2020-08-12 16:57:10 -04:00
Jon Chambers d60633a46c Add a meter for the number of messages we send via websocket connections. 2020-08-12 16:57:10 -04:00
Jon Chambers 0fcf28e7e7 Use the MessagesManager to actually persist messages. 2020-08-11 15:50:22 -04:00
Jon Chambers 5fad8f74b1 Factor MessagePersister into its own class. 2020-08-11 15:50:22 -04:00
Jon Chambers e35e34d2e0 Move operation-mirroring logic to MessagesManager. 2020-08-11 15:50:22 -04:00
Jon Chambers 31a215d4d6 Use "global." instead of "g." as the prefix for global config options. 2020-08-11 11:55:35 -04:00
Jon Chambers 30948de13d Update a metric provider dependency and remove a workaround for an upstream issue. 2020-08-11 11:02:38 -04:00
Ehren Kret b97158bf7b
Create global remote config controllable in the signal server configuration (#127)
* Add global config controller through file rather than database

* Do no permit attempting to set or delete global config entries
2020-08-10 16:31:15 -05:00
Jon Chambers 6646be8d94 Make CpuUsageGauge a CachedGauge. 2020-08-10 12:56:37 -04:00
Jon Chambers 647a2aea64 Cache a reference to the OS management bean to avoid repeated lookups. 2020-08-10 12:56:37 -04:00
Jon Chambers 58e58ce51c Remove a candidate metric provider. 2020-08-10 11:03:20 -04:00
Ehren Kret 4b7e48d3ec
Override default ingestion URI for SignalFx (#131) 2020-08-07 15:29:42 -05:00
Ehren Kret 0e074d3a5a Copy SignalFxMeterRegistry into a new class to get better logging 2020-08-07 16:01:56 -04:00
Ehren Kret ea00224e7f
Add support for reporting metrics to signalfx (#129) 2020-08-07 11:10:31 -05:00
Jon Chambers 38293efe75 Keep a running count of the number of open websockets. 2020-08-06 16:07:34 -04:00
Jon Chambers 3286c5e174 Disable Redis persistence for tests. 2020-08-06 11:22:51 -04:00
Jon Chambers e0f8a28f38 Close connections before closing the whole cluster client. 2020-08-06 11:22:31 -04:00
Jon Chambers bf1b00b163 Drop a spurious RedisClusterClient. 2020-08-06 11:22:31 -04:00
Ehren Kret 4fa3a136ad
Remove arbitrary SMS and add a NANPA message service (#123)
* Remove arbitrary SMS code

This code has run its course and is no longer needed for now.

* Add elements to sample config that were left out

* Add a messaging service for NANPA

* Fixup sample config capitalization
2020-08-05 13:35:11 -05:00
Jon Chambers 178a6bd66e Log the top-level exception name and message when crawling badness happens. 2020-08-05 11:23:16 -04:00
Ehren Kret 57e1339230
Further restrict user agent pattern matching (#120)
* Further restrict user agent pattern matching

* Add static qualifier to method
2020-08-04 12:58:16 -05:00
Jon Chambers 4144423227 Publish percentiles for Micrometer distributions/timers. 2020-08-04 10:58:59 -04:00
Jon Chambers 4d03514142 Add a command for clearing the messages cache cluster. 2020-08-04 10:58:41 -04:00
Jon Chambers 0bc5566976 Mirror delete-after-persist operations to the clustered message cache. 2020-08-04 10:58:41 -04:00
Jon Chambers 925567add5 Actually "plug in" the reglock counter. 2020-08-03 15:43:33 -04:00
Jon Chambers ad97731d46 Reduce the maximum number of versions in play to 1,000. 2020-08-03 15:42:15 -04:00
Jon Chambers 40684a93a2 Restrict user-agent version matching to a more confined space. 2020-08-03 15:42:15 -04:00
Jon Chambers f3b644ceb8 Update the push latency manager to use UUIDs and a Redis cluster. 2020-08-03 15:36:02 -04:00
Jon Chambers 901ba6e87f Added a push latency manager. 2020-08-03 15:36:02 -04:00
Jon Chambers 76389bd584 Clear would-be-persisted messages from the cache cluster, but don't store them to the database. 2020-07-30 19:14:39 -04:00
Jon Chambers 7bf8650d59 Un-manage FaultTolerantRedisCluster so it shuts down at JVM shutdown instead of Jetty shutdown. 2020-07-30 18:37:38 -04:00
Ehren Kret 7cb24dd96d Add environment tag to datadog metric reporting 2020-07-30 18:04:16 -04:00
Ehren Kret dee040318a Add the host tag to datadog metric reporting 2020-07-30 18:04:16 -04:00
Jon Chambers baf563e46d Temporarily disarm the actual persisting part of the message persister. 2020-07-30 17:12:37 -04:00
Jon Chambers e10246f10b Use Dropwizard timers/histograms for persister metrics. 2020-07-30 14:27:06 -04:00
Jon Chambers a9dfd88671 Start the clustered message persister at application startup. 2020-07-30 12:32:35 -04:00
Jon Chambers beac73b6c8 Add a cluster-capable message persister 2020-07-30 11:39:14 -04:00
Jon Chambers f9f93c77e2 Use UUIDs instead of phone numbers as account identifiers in clustered message cache 2020-07-30 11:39:14 -04:00
Jon Chambers 6fc1b4c6c0 Add a cluster-backed message cache. 2020-07-30 11:39:14 -04:00
Jon Chambers 639898ec07 Expand Experiment to deal with async suppliers and Optionals. 2020-07-30 11:39:14 -04:00
Jon Chambers 3d3790fdbc Add binary execution methods to ClusterLuaScript. 2020-07-30 11:39:14 -04:00
Jon Chambers 69c8968cb0 Add byte-array-based methods to FaultTolerantRedisCluster. 2020-07-30 11:39:14 -04:00
Jon Chambers aa25fc7901 Fix UsernamesManager metric/logger names. 2020-07-29 11:00:29 -04:00
Jon Chambers 4aba493ee2 Fix the key used for database crawler workers. 2020-07-29 10:58:06 -04:00
Jon Chambers b9cfac5934 Introduce additional metric aggregators. 2020-07-28 15:11:51 -04:00
Brian Acton f8e97fcc32 revise 12 hour active user fudge to 8 hours for better continuity of data from a month ago 2020-07-28 11:09:41 -07:00
Jon Chambers 7f8f2641f6 Simplify registration lock counting by avoiding inactive accounts. 2020-07-28 11:48:20 -04:00
Jon Chambers 022dbb606f Count registration lock versions when crawling the account database. 2020-07-28 11:48:20 -04:00
Jon Chambers fea72b190d Record message content size as a dimensioned distribution. 2020-07-28 11:47:56 -04:00
Jon Chambers eea073f882 Decommission the old cache. 2020-07-28 10:29:28 -04:00
Jon Chambers fc1d88f5bb Read exclusively from the cache cluster. 2020-07-27 15:11:40 -04:00
Jon Chambers acbe410e0b Remove a metric aggregator. 2020-07-27 12:50:49 -04:00
Ehren Kret 89bafea61f Move SMS strings to configuration 2020-07-27 11:23:21 -05:00
Jon Chambers 33a0c4a9ae Use first party metric aggregator libraries where possible. 2020-07-24 17:21:56 -04:00
Jon Chambers 4cc5999f05 Configure additional metric aggregators. 2020-07-23 13:31:19 -04:00
Jon Chambers 0fbf31ec98 Clear each cluster node individually. 2020-07-22 11:12:21 -04:00
Jon Chambers db9b7ca447 Fix slot assignment when building a cluster for tests. 2020-07-22 11:04:10 -04:00
Jon Chambers eecc71c77f
Revert batch message storage. (#95) 2020-07-20 16:28:32 -04:00
Jon Chambers 5f898a9071 Measure inserted message batch size. 2020-07-20 10:30:29 -04:00
Jon Chambers a08f21336a Be explicit about transaction management. 2020-07-20 10:30:29 -04:00
Jon Chambers 215125de26 Update tests. 2020-07-20 10:30:29 -04:00
Jon Chambers dfa94eac41 Store messages in batches. 2020-07-20 10:30:29 -04:00
Jon Chambers 247d869a5c De-randomize message tests to minimize flakiness. 2020-07-14 18:46:39 -04:00
Ehren Kret b9b6e1818f Rename SenderIdSelector to SenderIdSupplier per code review discussion 2020-07-14 10:53:48 -05:00
Ehren Kret a7968ccc3c Address code review comments 2020-07-14 10:53:48 -05:00
Ehren Kret b7e0e5a356 Create a strategy class to decide which sender id to use
The rules around selecting sender ids can get complicated with some
countries not supporting it and others requiring pre-registration that
may result in having a different sender id for that country than
others. This strategy class handles the logic of dealing with this
expanded configuration and applying the appropriate sender id or none
when it's not appropriate to do so at all.
2020-07-14 10:53:48 -05:00
Brian Acton e3aecb2aa9 apply a 12 hour fudge to daily user counting to account for last seen timestamp fuzzing 2020-07-09 17:43:12 -07:00
Jon Chambers 116ab83b95 Include a PushType header when sending APNs notifications. 2020-07-09 16:12:20 -04:00
Jon Chambers c5d0d4acd0 Revert "Move rate limiter logic to Lua scripts"
This reverts commit b585c6676d.
2020-07-09 12:30:25 -04:00
Jon Chambers 06190286ec Remove temporary circuit breaker suppression. 2020-07-07 16:33:05 -04:00
Jon Chambers 3bca856e87 Remove a pair of spurious SET calls in the rate limiter script. 2020-07-07 16:33:05 -04:00
Jon Chambers b3a778b89a Temporarily catch and log all script execution exceptions to avoid opening the breaker. 2020-07-07 15:17:25 -04:00
Jon Chambers dcb11f7606 Log errors from experiments. 2020-07-07 15:17:25 -04:00
Jon Chambers 933ce42d5a Test rate limiters against a real cluster. 2020-07-07 15:17:25 -04:00
Ehren Kret 6c1ba957bd Ensure the default alphaId configuration is an empty list rather than null 2020-07-07 10:17:40 -05:00