Commit Graph

2810 Commits

Author SHA1 Message Date
Jon Chambers 2ba36ee04c Add a gauge for worker thread liveness. 2020-10-03 11:43:34 -04:00
Jon Chambers fc05529574 Let MessagePersister manage its own worker thread. 2020-10-03 11:43:34 -04:00
Jon Chambers 07d24f487a Don't re-register metrics for shared circuit breakers. 2020-10-02 15:05:00 -04:00
Jon Chambers 811acdb7f5 Use separate namespaces for Redis breaker/retry metrics. 2020-10-02 10:57:05 -04:00
Jon Chambers a7266364d1 Refactor peer pruning to be more retry-friendly. 2020-10-01 17:17:07 -04:00
Jon Chambers e83b41dc01 Reduce default Redis cluster command timeout to 3 seconds. 2020-10-01 17:17:07 -04:00
Jon Chambers 76665dd56e Retry Redis commands that time out. 2020-10-01 17:17:07 -04:00
Jon Chambers 2d42b478ba Consolidate cluster and pub/sub circuit breakers. 2020-10-01 17:17:07 -04:00
Jon Chambers 885fa6beae Add tests for Device#isEnabled. 2020-10-01 12:54:35 -04:00
Jon Chambers 65cdd5fcbe Drop the 365-day check when deciding if an account is enabled. 2020-10-01 12:54:35 -04:00
Jon Chambers 4302e19aba Register a UUID argument factory for the messages database. 2020-10-01 11:06:43 -04:00
Jon Chambers 0c6f05f34a Add a (failing!) test for sending a sealed-sender message after a non-sealed-sender message. 2020-10-01 11:06:43 -04:00
Jon Chambers 8040c285cd Include stack traces when reporting persistence issues. 2020-09-30 11:47:16 -04:00
Jon Chambers ada454f56f Add a meter for persisting individual messages. 2020-09-30 10:39:56 -04:00
Jon Chambers 57d2ef8740 Return queues to the "to persist" list if something goes wrong during persistence. 2020-09-30 10:39:56 -04:00
Jon Chambers a97e0982e3 Add an integration test for message persistence. 2020-09-30 10:39:56 -04:00
Jon Chambers eaa2060d84 Fix an incorrect locking key and some previously-suppressed lock contention issues. 2020-09-30 10:39:56 -04:00
Jon Chambers 3e02c574e7 Log exceptions when persisting messages. 2020-09-30 10:39:56 -04:00
Jon Chambers c7230ccbb0 Remove messages from the cache in bulk. 2020-09-29 10:58:02 -04:00
Jon Chambers fc71ced660 Persist messages in batches. 2020-09-29 10:58:02 -04:00
Jon Chambers 6041a9d094 Make exit conditions slightly more conservative. 2020-09-29 10:58:02 -04:00
Jon Chambers 599cd766e1 Let Dropwizard manage persister thread lifecycles. 2020-09-29 10:58:02 -04:00
Alan Evans e64c8007c0
Detect GV2 capability in non-gcm Android devices 2020-09-28 15:54:10 -04:00
Jon Chambers 9339823e84 Add temporary metrics to monitor the ratio of enabled/disabled accounts. 2020-09-28 15:33:52 -04:00
Jon Chambers e6d4620af1 Only allow linking desktop clients if they support the third-generation GV2 capability. 2020-09-25 17:08:32 -04:00
Jon Chambers 656e6db846 Only consider desktop devices GV2-capable if they send the third-gen GV2 capability. 2020-09-25 17:08:32 -04:00
Jon Chambers 30474e3a2b Add a test for message ordering. 2020-09-25 11:41:58 -04:00
Jon Chambers 460bd98f1b Add metrics for messages missing GUIDs. 2020-09-25 11:41:22 -04:00
Jon Chambers a553eba574 Add an API endpoint for deleting accounts. 2020-09-25 11:39:17 -04:00
Jon Chambers 61f515670c Add plumbing for deleting accounts and all associated data. 2020-09-25 11:39:17 -04:00
Jon Chambers 789af0f8a6 Add support for deleting keys associated with an account. 2020-09-25 11:39:17 -04:00
Jon Chambers 86fae58c96 Add support for deleting account entities from the database. 2020-09-25 11:39:17 -04:00
Jon Chambers c54d3abe47 Check for the second-gen GV2 capability when linking devices. 2020-09-24 19:04:02 -04:00
Jon Chambers 6fe511eb50 Fix a bad size check when loading stored messages. 2020-09-23 18:02:33 -04:00
Jon Chambers 17d18b22c7 Drop pub/sub sending logic from WebsocketSender. 2020-09-23 14:51:02 -04:00
Jon Chambers 66a04ed730 Don't explicitly notify clients when messages get persisted. 2020-09-23 14:51:02 -04:00
Jon Chambers 7e14a0bc30 Drop pub/sub operations from WebsocketConnection. 2020-09-23 14:51:02 -04:00
Jon Chambers 77de0f86dc Require desktop clients to send the new gv2-2 capability flag. 2020-09-23 12:05:58 -04:00
Jon Chambers 3b4bc9163a Untangle thread pool names, tweak sizes, and add instrumentation. 2020-09-22 10:21:33 -04:00
Jon Chambers e146135bd1 Don't attempt to send more messages if sending failed for any reason. 2020-09-22 10:21:33 -04:00
Jon Chambers e9e18afb4a Add a (failing) integration test demonstrating an infinite loop. 2020-09-22 10:21:33 -04:00
Jon Chambers 62c31eb202 Revert "Revert keyspace delivery for all messages"
This reverts commit 4dc49604b6.
2020-09-22 10:21:33 -04:00
Jon Chambers 1eacee85ae Count how many iOS users set the old GV2 capability flag. 2020-09-21 18:58:07 -04:00
Jon Chambers 5986145282 Add a second-generation GV2 capability and ignore the old capability for iOS devices. 2020-09-21 18:57:53 -04:00
Jon Chambers b134a69a28 Record the number of authentications for users with/without GV2 support. 2020-09-21 15:42:13 -04:00
Jon Chambers 83f9eacac4 Refactor UserAgentTagUtil to parse UA strings with UserAgentUtil. 2020-09-21 12:24:08 -04:00
Jon Chambers baab6b951b Add a general utility class for parsing user-agent strings. 2020-09-21 12:24:08 -04:00
Jon Chambers b041fbe3ec Add semver4j as a dependency. 2020-09-21 12:24:08 -04:00
Jon Chambers 903a1bec91 Reject (eventually) oversize messages. 2020-09-17 17:07:20 -04:00
Jon Chambers ebc3a251b7 Drop the UUID addressing capability flag entirely. 2020-09-14 15:36:29 -04:00
Jon Chambers a567f4a6de Don't check UUID capability when blocking capability downgrades. 2020-09-14 15:36:29 -04:00
Jon Chambers 4dc49604b6
Revert keyspace delivery for all messages
* Revert "Send all messages via keyspace notifications when a feature flag is enabled."

This reverts commit fadcf62166.

* Revert "Consolidate semaphore release logic."

This reverts commit c02b255766.

* Revert "Represent stored message state as an enumeration rather than a collection of booleans."

This reverts commit 89788fa665.

* Revert "Refactor: collapse state into semaphores/atomic booleans."

This reverts commit a052e2ee8f.

* Revert "Refactor: move sendNextMessagePage into its own method."

This reverts commit 158e5004b7.

* Revert "Avoid querying the database if we think all new messages are in the cache."

This reverts commit 6f9ff3be37.

* Revert "Query for more stored messages if an update happens while we're already processing a batch."

This reverts commit f766c57743.

* Revert "Only send the "queue cleared" message once per websocket session."

This reverts commit 8f53152c3e.

* Revert "Let processStoredMessages handle requery logic."

This reverts commit 7bbc88d716.

* Revert "Only allow one thread to process stored messages at a time."

This reverts commit 68256d2343.
2020-09-14 15:35:10 -04:00
Jon Chambers fadcf62166 Send all messages via keyspace notifications when a feature flag is enabled. 2020-09-11 13:12:17 -04:00
Jon Chambers c02b255766 Consolidate semaphore release logic. 2020-09-11 13:12:17 -04:00
Jon Chambers 89788fa665 Represent stored message state as an enumeration rather than a collection of booleans. 2020-09-11 13:12:17 -04:00
Jon Chambers a052e2ee8f Refactor: collapse state into semaphores/atomic booleans. 2020-09-11 13:12:17 -04:00
Jon Chambers 158e5004b7 Refactor: move sendNextMessagePage into its own method. 2020-09-11 13:12:17 -04:00
Jon Chambers 6f9ff3be37 Avoid querying the database if we think all new messages are in the cache. 2020-09-11 13:12:17 -04:00
Jon Chambers f766c57743 Query for more stored messages if an update happens while we're already processing a batch. 2020-09-11 13:12:17 -04:00
Jon Chambers 8f53152c3e Only send the "queue cleared" message once per websocket session. 2020-09-11 13:12:17 -04:00
Jon Chambers 7bbc88d716 Let processStoredMessages handle requery logic. 2020-09-11 13:12:17 -04:00
Jon Chambers 68256d2343 Only allow one thread to process stored messages at a time. 2020-09-11 13:12:17 -04:00
Ehren Kret f88c440c48
Automatically retry when Twilio returns unreachable (#190)
* Parse and log the Twilio error code

* Automatically retry without sender ID when Twilio returns unreachable

* Remove attempt count and pass around whether or not sender id was used
2020-09-10 13:58:39 -05:00
Jon Chambers cfa56ba6d4 Remove the "send online messages via keyspace notifications" feature flag. 2020-09-10 10:41:20 -04:00
Jon Chambers 2c6b646d87 Enforce no capability downgrade on device verification 2020-09-09 16:05:00 -04:00
Jon Chambers e7572094b5 Require all enabled devices to support GV2. 2020-09-09 16:05:00 -04:00
Jon Chambers 5e34823a49 Optionally send online-only messages via keyspace notifications. 2020-09-09 14:42:09 -04:00
Jon Chambers fdef21a871 Record and listen for ephemeral messages in a separate queue. 2020-09-09 14:42:09 -04:00
Jon Chambers d40cff8a99 Revert "Add a system for storing, retrieving, and notifying listeners about ephemeral (online) messages."
This reverts commit 06754d6158.
2020-09-08 15:55:09 -04:00
Jon Chambers 8927e45ded Revert "Optionally send online-only messages via keyspace notifications."
This reverts commit 12fe28d8ab.
2020-09-08 15:55:09 -04:00
Jon Chambers 1a93df92d4 Replace DeliveryStatus with a simple boolean. 2020-09-08 11:29:33 -04:00
Jon Chambers 12fe28d8ab Optionally send online-only messages via keyspace notifications. 2020-09-08 11:19:55 -04:00
Jon Chambers 06754d6158 Add a system for storing, retrieving, and notifying listeners about ephemeral (online) messages. 2020-09-08 11:14:42 -04:00
Jon Chambers 1d5087374e Jettison UUID-or-E164 plumbing in favor of UUID-only. 2020-09-08 09:30:47 -04:00
Jon Chambers 8356264fe0 Rename RedisClusterMessagesCache and related classes to just MessagesCache. 2020-09-08 09:30:47 -04:00
Jon Chambers 18ecd748dd Entirely discard the old message cache machinery. 2020-09-08 09:30:47 -04:00
Jon Chambers e324f27655 Stop sending/processing CONNECTED pub/sub messages. 2020-09-03 13:52:43 -04:00
Jon Chambers afd645fb11 Retrieve messages using commands available in Redis 3. 2020-09-03 13:31:55 -04:00
Jon Chambers 5b42593fbb Persist messages one page at a time. 2020-09-03 12:08:46 -04:00
Jon Chambers 25f3c6a548 Drop our dependency on commons-pool. 2020-09-03 11:05:10 -04:00
Jon Chambers 5c04f2634a Use a dedicated executor service for dispatching keyspace notifications. 2020-09-03 11:04:48 -04:00
Jon Chambers ad01610d1e Rely on the client presence manager to decide whether to send push notifications. 2020-09-03 11:04:48 -04:00
Jon Chambers 697c380cd1 Close websocket connections when displaced. 2020-09-03 11:04:48 -04:00
Jon Chambers 81e8143a43 Rely solely on the clustered message cache. 2020-09-02 11:57:33 -04:00
Jon Chambers 8409986ef5 Mirror persistence operations from the new persister to the old persister. 2020-09-02 11:02:40 -04:00
Jon Chambers 2b50367d7f Put message persisters behind feature flags. 2020-09-02 11:02:40 -04:00
Jon Chambers 1dcc491fec Move cache-mirroring operations to the calling thread. 2020-09-01 12:34:37 -04:00
Ehren Kret d715f86713 Refactor to constants 2020-09-01 10:55:26 -04:00
Ehren Kret 5221828705 Increase maximum sticker size to 300 kibibytes
In preparation for animated stickers, allow stickers to be up to 300
kibibytes.
2020-09-01 10:55:26 -04:00
Jon Chambers 6aa4acd3db Mirror "clear queue" operations to the clustered cache. 2020-09-01 10:55:07 -04:00
Jon Chambers 15936c29c1 Let Dropwizard manage the lifecycle of the feature flag manager. 2020-09-01 10:50:59 -04:00
Jon Chambers 8b70c69a0d Replace metrics with logging statements. 2020-08-31 15:57:17 -04:00
Jon Chambers dfe80a30dc Make ScourMessageCacheCommand a ConfiguredCommand instead of an EnvironmentCommand. 2020-08-31 15:57:17 -04:00
Jon Chambers ce026e7ad0 Don't send contacts to CDS if they've opted out of discoverability. (SERVER-130) 2020-08-27 15:58:02 -04:00
Jon Chambers 58e3122dab Add a discoverableByPhoneNumber account attribute. (SERVER-129) 2020-08-27 15:58:02 -04:00
Jon Chambers 3b55b2d1b2 Actually make the "scour message cache" available to Dropwizard. Oops. 2020-08-27 15:15:04 -04:00
Jon Chambers 2326e61de5 Clear and re-create gauges to avoid "stuck" feature flag reporting. 2020-08-27 13:18:12 -04:00
Jon Chambers 32b18c9509 Add an endpoint for getting the current state of feature flags. 2020-08-27 13:18:12 -04:00
Jon Chambers acf52ad8a3 Make feature flag manager tests use a real database to avoid over-mocking. 2020-08-27 13:18:12 -04:00
Jon Chambers 08dd493f98 Don't report exceptions as part of traffic metrics. 2020-08-27 13:17:57 -04:00
Jon Chambers 07bbe7dfb2 Return to an async model for push notification latency. 2020-08-27 10:51:44 -04:00
Jon Chambers 0aa1b80e3e Add a command for persisting any detached messages in the old message cache. 2020-08-27 10:51:12 -04:00
Jon Chambers 5ac390281e Add an abstract base class for Redis singleton tests. 2020-08-27 10:51:12 -04:00
Jon Chambers ac465c5a18 Add a Lettuce-based Redis singleton client. 2020-08-27 10:51:12 -04:00
Jon Chambers 1ef3546822
Add support for server-side feature flags 2020-08-26 20:27:33 -04:00
Jon Chambers e74ad2b555 Make RedisClusterMessagesCache a Managed class. 2020-08-25 10:58:01 -04:00
Jon Chambers 71c0056c66 Use lots of specific subscriptions instead of one monster subscription to minimize load. 2020-08-25 10:58:01 -04:00
Jon Chambers 56b27ea785 Record experiment outcomes with timers instead of counters. 2020-08-25 10:57:44 -04:00
Jon Chambers 2d75f59d33 Add support for UUID-only delivery certificates. (SERVER-132) 2020-08-20 17:05:53 -04:00
Jon Chambers a709a3bcc0 Remove a candidate metric provider. 2020-08-20 15:40:56 -04:00
Jon Chambers 34bf5112e0 Drop TimeProvider. 2020-08-20 15:40:24 -04:00
Jon Chambers bfe18d1d28 Re-nerf the clustered message persister. 2020-08-20 15:38:09 -04:00
Jon Chambers 6a76afc20d Add a test to make sure the persister is respecting persist delays. 2020-08-20 15:38:09 -04:00
Jon Chambers 9c469c2f96 Base persister tests on a real Redis cluster. 2020-08-20 15:38:09 -04:00
Jon Chambers 2ab42f3dd6 Refine and expand clustered message cache metrics. 2020-08-19 11:39:05 -04:00
Jon Chambers af34b43a8d Reactivate the message notification experiment. 2020-08-19 11:39:05 -04:00
Jon Chambers 0f71cc7864 Rename metrics associated with cluster circuit breakers for clarity. 2020-08-18 17:59:00 -04:00
Jon Chambers df90de3a5f Change default Lettuce command timeout to 10s. 2020-08-18 16:21:42 -04:00
Jon Chambers 42ea7a9814 Revert Lettuce connection pooling. 2020-08-18 16:21:42 -04:00
Jon Chambers c683cbdb2d Time Redis operations. 2020-08-18 12:20:12 -04:00
Jon Chambers d243b73678 Make Lettuce connection pools configurable. Double the default size. 2020-08-18 12:20:12 -04:00
Jon Chambers dc28d063aa Reactivate the explicit client presence experiment. 2020-08-17 11:34:27 -04:00
Jon Chambers bb6045c1d0 Disarm the client presence manager experiment. 2020-08-15 20:23:05 -04:00
Jon Chambers f1a74b5939 Disarm new message keyspace notifications. 2020-08-15 20:23:05 -04:00
Jon Chambers 6fb9038af1 Move to a synchronous, pooled connection model for Redis clusters. 2020-08-14 17:15:56 -04:00
Jon Chambers 27f721a1f5 Update to resilience4j 1.5.0. 2020-08-14 17:15:56 -04:00
Jon Chambers 5717dc294e Combine the read/write breakers for Redis clusters. 2020-08-14 17:15:56 -04:00
Jon Chambers ae0f8df11b Break out FaultTolerantPubSubConnection as its own thing so different use cases can have their own subscription space. 2020-08-14 17:15:56 -04:00
Jon Chambers 77460ba502 Remove keyspace notification configuration checks because AWS doesn't support `CONFIG GET`. 2020-08-13 15:32:25 -04:00
Jon Chambers f8235da4d8 Fix an issue where the queue for a thread pool was not bounded. 2020-08-13 12:46:11 -04:00
Jon Chambers 8d3316ccd6 Listen for new messages via keyspace notifications. 2020-08-13 12:17:04 -04:00
Jon Chambers 2c29f831e8 Add an explicit client presence system. 2020-08-13 10:56:26 -04:00
Jon Chambers 9457325119 Add pub/sub affordances to FaultTolerantRedisCluster. 2020-08-13 10:56:26 -04:00
Jon Chambers 189f8afcc9 Warm up the test cluster before running tests to avoid transient startup jitters. 2020-08-13 10:56:26 -04:00
Jon Chambers f3a34990ab Update to Lettuce 5.3.3. 2020-08-12 16:57:23 -04:00
Jon Chambers 9699b67510 Record the size of outgoing message lists. 2020-08-12 16:57:10 -04:00
Jon Chambers d60633a46c Add a meter for the number of messages we send via websocket connections. 2020-08-12 16:57:10 -04:00
Jon Chambers 0fcf28e7e7 Use the MessagesManager to actually persist messages. 2020-08-11 15:50:22 -04:00
Jon Chambers 5fad8f74b1 Factor MessagePersister into its own class. 2020-08-11 15:50:22 -04:00
Jon Chambers e35e34d2e0 Move operation-mirroring logic to MessagesManager. 2020-08-11 15:50:22 -04:00
Jon Chambers 31a215d4d6 Use "global." instead of "g." as the prefix for global config options. 2020-08-11 11:55:35 -04:00
Jon Chambers 30948de13d Update a metric provider dependency and remove a workaround for an upstream issue. 2020-08-11 11:02:38 -04:00
Ehren Kret b97158bf7b
Create global remote config controllable in the signal server configuration (#127)
* Add global config controller through file rather than database

* Do no permit attempting to set or delete global config entries
2020-08-10 16:31:15 -05:00
Jon Chambers 6646be8d94 Make CpuUsageGauge a CachedGauge. 2020-08-10 12:56:37 -04:00
Jon Chambers 647a2aea64 Cache a reference to the OS management bean to avoid repeated lookups. 2020-08-10 12:56:37 -04:00
Jon Chambers 58e58ce51c Remove a candidate metric provider. 2020-08-10 11:03:20 -04:00
Ehren Kret 4b7e48d3ec
Override default ingestion URI for SignalFx (#131) 2020-08-07 15:29:42 -05:00
Ehren Kret 0e074d3a5a Copy SignalFxMeterRegistry into a new class to get better logging 2020-08-07 16:01:56 -04:00
Ehren Kret ea00224e7f
Add support for reporting metrics to signalfx (#129) 2020-08-07 11:10:31 -05:00
Jon Chambers 38293efe75 Keep a running count of the number of open websockets. 2020-08-06 16:07:34 -04:00
Jon Chambers 3286c5e174 Disable Redis persistence for tests. 2020-08-06 11:22:51 -04:00
Jon Chambers e0f8a28f38 Close connections before closing the whole cluster client. 2020-08-06 11:22:31 -04:00
Jon Chambers bf1b00b163 Drop a spurious RedisClusterClient. 2020-08-06 11:22:31 -04:00
Ehren Kret 4fa3a136ad
Remove arbitrary SMS and add a NANPA message service (#123)
* Remove arbitrary SMS code

This code has run its course and is no longer needed for now.

* Add elements to sample config that were left out

* Add a messaging service for NANPA

* Fixup sample config capitalization
2020-08-05 13:35:11 -05:00
Jon Chambers 178a6bd66e Log the top-level exception name and message when crawling badness happens. 2020-08-05 11:23:16 -04:00
Ehren Kret 57e1339230
Further restrict user agent pattern matching (#120)
* Further restrict user agent pattern matching

* Add static qualifier to method
2020-08-04 12:58:16 -05:00
Jon Chambers 4144423227 Publish percentiles for Micrometer distributions/timers. 2020-08-04 10:58:59 -04:00
Jon Chambers 4d03514142 Add a command for clearing the messages cache cluster. 2020-08-04 10:58:41 -04:00
Jon Chambers 0bc5566976 Mirror delete-after-persist operations to the clustered message cache. 2020-08-04 10:58:41 -04:00
Jon Chambers 925567add5 Actually "plug in" the reglock counter. 2020-08-03 15:43:33 -04:00
Jon Chambers ad97731d46 Reduce the maximum number of versions in play to 1,000. 2020-08-03 15:42:15 -04:00
Jon Chambers 40684a93a2 Restrict user-agent version matching to a more confined space. 2020-08-03 15:42:15 -04:00
Jon Chambers f3b644ceb8 Update the push latency manager to use UUIDs and a Redis cluster. 2020-08-03 15:36:02 -04:00
Jon Chambers 901ba6e87f Added a push latency manager. 2020-08-03 15:36:02 -04:00
Jon Chambers 76389bd584 Clear would-be-persisted messages from the cache cluster, but don't store them to the database. 2020-07-30 19:14:39 -04:00
Jon Chambers 7bf8650d59 Un-manage FaultTolerantRedisCluster so it shuts down at JVM shutdown instead of Jetty shutdown. 2020-07-30 18:37:38 -04:00
Ehren Kret 7cb24dd96d Add environment tag to datadog metric reporting 2020-07-30 18:04:16 -04:00
Ehren Kret dee040318a Add the host tag to datadog metric reporting 2020-07-30 18:04:16 -04:00
Jon Chambers baf563e46d Temporarily disarm the actual persisting part of the message persister. 2020-07-30 17:12:37 -04:00
Jon Chambers e10246f10b Use Dropwizard timers/histograms for persister metrics. 2020-07-30 14:27:06 -04:00
Jon Chambers a9dfd88671 Start the clustered message persister at application startup. 2020-07-30 12:32:35 -04:00
Jon Chambers beac73b6c8 Add a cluster-capable message persister 2020-07-30 11:39:14 -04:00
Jon Chambers f9f93c77e2 Use UUIDs instead of phone numbers as account identifiers in clustered message cache 2020-07-30 11:39:14 -04:00
Jon Chambers 6fc1b4c6c0 Add a cluster-backed message cache. 2020-07-30 11:39:14 -04:00
Jon Chambers 639898ec07 Expand Experiment to deal with async suppliers and Optionals. 2020-07-30 11:39:14 -04:00
Jon Chambers 3d3790fdbc Add binary execution methods to ClusterLuaScript. 2020-07-30 11:39:14 -04:00
Jon Chambers 69c8968cb0 Add byte-array-based methods to FaultTolerantRedisCluster. 2020-07-30 11:39:14 -04:00
Jon Chambers aa25fc7901 Fix UsernamesManager metric/logger names. 2020-07-29 11:00:29 -04:00
Jon Chambers 4aba493ee2 Fix the key used for database crawler workers. 2020-07-29 10:58:06 -04:00
Jon Chambers b9cfac5934 Introduce additional metric aggregators. 2020-07-28 15:11:51 -04:00
Brian Acton f8e97fcc32 revise 12 hour active user fudge to 8 hours for better continuity of data from a month ago 2020-07-28 11:09:41 -07:00
Jon Chambers 7f8f2641f6 Simplify registration lock counting by avoiding inactive accounts. 2020-07-28 11:48:20 -04:00
Jon Chambers 022dbb606f Count registration lock versions when crawling the account database. 2020-07-28 11:48:20 -04:00
Jon Chambers fea72b190d Record message content size as a dimensioned distribution. 2020-07-28 11:47:56 -04:00
Jon Chambers eea073f882 Decommission the old cache. 2020-07-28 10:29:28 -04:00
Jon Chambers fc1d88f5bb Read exclusively from the cache cluster. 2020-07-27 15:11:40 -04:00
Jon Chambers acbe410e0b Remove a metric aggregator. 2020-07-27 12:50:49 -04:00
Ehren Kret 89bafea61f Move SMS strings to configuration 2020-07-27 11:23:21 -05:00
Jon Chambers ec072fd639 Update telemetry dependencies and resolve dependency conflicts. 2020-07-24 18:59:35 -04:00
Jon Chambers 67b03076d7 Add a missing dependency. 2020-07-24 17:21:56 -04:00
Jon Chambers 33a0c4a9ae Use first party metric aggregator libraries where possible. 2020-07-24 17:21:56 -04:00
Jon Chambers 4cc5999f05 Configure additional metric aggregators. 2020-07-23 13:31:19 -04:00
Jon Chambers 403aa5fd3e Add dependencies for additional metric aggregators. 2020-07-23 13:31:19 -04:00
Jon Chambers 0fbf31ec98 Clear each cluster node individually. 2020-07-22 11:12:21 -04:00
Jon Chambers db9b7ca447 Fix slot assignment when building a cluster for tests. 2020-07-22 11:04:10 -04:00
Jon Chambers eecc71c77f
Revert batch message storage. (#95) 2020-07-20 16:28:32 -04:00
Jon Chambers 5f898a9071 Measure inserted message batch size. 2020-07-20 10:30:29 -04:00
Jon Chambers a08f21336a Be explicit about transaction management. 2020-07-20 10:30:29 -04:00
Jon Chambers 215125de26 Update tests. 2020-07-20 10:30:29 -04:00
Jon Chambers dfa94eac41 Store messages in batches. 2020-07-20 10:30:29 -04:00
Jon Chambers 247d869a5c De-randomize message tests to minimize flakiness. 2020-07-14 18:46:39 -04:00
Ehren Kret b9b6e1818f Rename SenderIdSelector to SenderIdSupplier per code review discussion 2020-07-14 10:53:48 -05:00
Ehren Kret a7968ccc3c Address code review comments 2020-07-14 10:53:48 -05:00
Ehren Kret b7e0e5a356 Create a strategy class to decide which sender id to use
The rules around selecting sender ids can get complicated with some
countries not supporting it and others requiring pre-registration that
may result in having a different sender id for that country than
others. This strategy class handles the logic of dealing with this
expanded configuration and applying the appropriate sender id or none
when it's not appropriate to do so at all.
2020-07-14 10:53:48 -05:00
Brian Acton e3aecb2aa9 apply a 12 hour fudge to daily user counting to account for last seen timestamp fuzzing 2020-07-09 17:43:12 -07:00
Jon Chambers 116ab83b95 Include a PushType header when sending APNs notifications. 2020-07-09 16:12:20 -04:00
Jon Chambers c5d0d4acd0 Revert "Move rate limiter logic to Lua scripts"
This reverts commit b585c6676d.
2020-07-09 12:30:25 -04:00
Jon Chambers 06190286ec Remove temporary circuit breaker suppression. 2020-07-07 16:33:05 -04:00
Jon Chambers 3bca856e87 Remove a pair of spurious SET calls in the rate limiter script. 2020-07-07 16:33:05 -04:00
Jon Chambers b3a778b89a Temporarily catch and log all script execution exceptions to avoid opening the breaker. 2020-07-07 15:17:25 -04:00
Jon Chambers dcb11f7606 Log errors from experiments. 2020-07-07 15:17:25 -04:00
Jon Chambers 933ce42d5a Test rate limiters against a real cluster. 2020-07-07 15:17:25 -04:00
Ehren Kret 6c1ba957bd Ensure the default alphaId configuration is an empty list rather than null 2020-07-07 10:17:40 -05:00
Ehren Kret e021286eee Add configuration by country for sending from alpha IDs 2020-07-07 10:17:40 -05:00
Ehren Kret 0ee7a66033
Keep trying ports until you get one lower than 55535 (#83)
* Keep trying ports until you get one lower than 55535

* Rename method and change to do...while

* Limit attempts to 11,000 to find an open redis cluster port
2020-07-07 10:12:31 -05:00
Jon Chambers 42c797ee97 Set the default log level for tests to WARN. 2020-07-07 11:05:39 -04:00
Jon Chambers b585c6676d
Move rate limiter logic to Lua scripts 2020-07-06 10:10:13 -04:00
Jon Chambers f5ddb0f1f8 Test ClusterLuaScript against a real Redis cluster. 2020-07-02 18:58:30 -04:00
Jon Chambers ef97f9e738 Revert "Temporarily suspend execution of the "unlock" script."
This reverts commit 6aecd8d44a.
2020-07-02 18:58:30 -04:00
Jon Chambers 26a03b55de Un-reinvent the clustered script execution wheel. 2020-07-02 18:58:30 -04:00
Jon Chambers b93a16abae Honor the step size set in the micrometer config. 2020-07-02 11:40:41 -04:00
Jon Chambers ff2783d434 Fixed a goof where we were mirroring a write to the wrong key in the new cache cluster. 2020-07-02 11:40:27 -04:00
Ehren Kret 25a5a8db68
Set avatar to null on Account when request is false (#78) 2020-06-29 15:53:31 -05:00
Jon Chambers a68d91b54c Resolve some test flakiness by adding a deterministic "wait" mechanism. (SERVER-86) 2020-06-29 12:24:25 -04:00
Jon Chambers 88ec3a5751 Add a counter for dead letter events. 2020-06-26 09:00:11 -04:00
Jon Chambers 734dc2e37a Don't block the Redis instance when clearing the cache. 2020-06-19 10:52:18 -04:00
Jon Chambers 6aecd8d44a Temporarily suspend execution of the "unlock" script. 2020-06-17 22:27:02 -04:00
Jon Chambers bbf5e1fa78 Use the UA string from websocket upgrade requests if available. 2020-06-17 15:40:18 -04:00
Jon Chambers 7454e55693 Write synchronously to the cache cluster. 2020-06-17 15:38:56 -04:00
Jon Chambers c745fe7778 Fix a poorly-mirrored cache delete operation. 2020-06-17 15:35:46 -04:00
Jon Chambers 6adcebb247 Return to just using counters instead of timers for measuring experiment outcomes. 2020-06-17 15:34:02 -04:00
Jon Chambers 38f9b8f3dd Make write operations in `AccountDatabaseCrawlerCache` synchronous. 2020-06-17 10:05:43 -04:00
Jon Chambers 7faf143a97 Subdivide the account database crawler cache experiment and add logging to track down lingering disagreements. 2020-06-17 09:23:40 -04:00
Jon Chambers 17cfd4924c Fixed a poorly-mirrored write operation to the new cluster. 2020-06-16 16:46:41 -04:00
Jon Chambers a0bebca1e6 Extend Experiment to report more detail when results don't match. 2020-06-16 16:46:41 -04:00
Jon Chambers 75cbfa2898 Mirror unlock-via-script calls to the cache cluster. 2020-06-16 16:46:41 -04:00
Jon Chambers 58a8ed1588 Add a cluster-friendly version of LuaScript. 2020-06-16 16:46:41 -04:00
Jon Chambers e032f8df59 Add a command for clearing the cache cluster. 2020-06-16 16:46:41 -04:00
Jon Chambers b16e37d80a Record a histogram of incoming message list sizes. 2020-06-12 14:43:50 -04:00
Jon Chambers c17cc07b73 Instrument BlockingThreadPoolExecutor. 2020-06-12 14:43:50 -04:00
Jon Chambers 6f767a72a7 Add a timer for the private sendMessage method. 2020-06-12 14:43:50 -04:00
Jon Chambers 11196436e9 Time rate limiter validation calls. 2020-06-12 14:43:50 -04:00
Jon Chambers 9afc433db4 Record exceptions associated with server responses. 2020-06-11 22:08:07 -04:00
Jon Chambers f701e3d834 Record distributions of timer values; stop recording error causes. 2020-06-11 11:50:36 -04:00
Jon Chambers 4c623ca3c5 Compare Redis reads using Lettuce's synchronous path. 2020-06-11 11:50:36 -04:00
Jon Chambers 0671f05c05 Introduce experiment comparison methods for suppliers. 2020-06-11 11:50:36 -04:00
Jon Chambers 0713da7393 Record experiment results with a timer instead of a counter. 2020-06-11 11:50:36 -04:00
Jon Chambers 05955d0483 Check for null header values before trying to iterate through them. 2020-06-09 15:45:32 -04:00
Jon Chambers 28c765bd9a Add an in-app-context test for websocket metrics. 2020-06-09 15:45:32 -04:00
Ehren Kret 8287317be7 Add account device ID to the prekey rate limiter
This limits prekey fetching per device on an account instead of on an
account level.
2020-06-09 10:20:10 -07:00
Jon Chambers ec858b2d4c
Set a timeout for Redis cluster operations and shut down the cluster as part of service shutdown 2020-06-07 18:27:57 -04:00
Jon Chambers 47ece983d2 Added a Redis cluster health check. 2020-06-07 18:27:11 -04:00
Jon Chambers 52310b5dd9 Compare results of reads from old and new Redis caches. 2020-06-07 18:27:11 -04:00
Jon Chambers c2a4a2778e Introduce the Experiment class to compare results from parallel systems. 2020-06-07 18:27:11 -04:00
Jon Chambers 1db5977e80 Mirror username deletes unconditionally. 2020-06-07 18:27:11 -04:00
Jon Chambers 1b5dc0e434 Fixed a potential issue where locks could get out of sync between Redis instances. 2020-06-07 18:27:11 -04:00
Moxie Marlinspike f07f02d866 Deliver upgrade link to stale clients 2020-06-06 18:20:55 -07:00
Jon Chambers 1388103919 Mirror writes to the cache cluster. 2020-06-06 20:37:48 -04:00
Jon Chambers fe1054d58a Introduce a Lettuce-based fault-tolerant Redis cluster accessor. 2020-06-06 20:37:48 -04:00
Jon Chambers ba6ac778fc Update to Pushy v0.14.1. 2020-06-05 12:21:56 -04:00
Jon Chambers 228ffcbfce Differentiate between websocket and "boring" HTTP traffic. 2020-05-28 12:52:49 -04:00
Jon Chambers f18ab9e5cc Measure traffic from websockets. 2020-05-28 12:52:49 -04:00
Jon Chambers 06c82ee87d Celebrate the diversity of UA strings when generating tags for metrics. 2020-05-27 19:35:42 -04:00
Jon Chambers 9ba5ee8043 Move UA tag extraction into its own utility class. 2020-05-27 19:35:42 -04:00
Ehren Kret eede4e50ca
Use hashed UUID to spread last seen updates over a full day (#40) 2020-05-26 13:38:52 -07:00
Jon Chambers aa10f63d9f Add the timestamp using the `add` method. 2020-05-22 17:39:25 -04:00
Jon Chambers a25af36e32 Include timestamps in all server-to-client websocket messages. 2020-05-22 15:13:39 -04:00
Jon Chambers 817f057927 Inject timestamps into responses. 2020-05-22 15:13:39 -04:00
Jon Chambers a13c44d81a Capture request-level metrics (path, status, client platform/version). 2020-05-20 17:48:19 -04:00
Jon Chambers 45ad8f8ffb Add the Wavefront/Micrometer reporter as a dependency and configure a registry. 2020-05-20 17:46:07 -04:00
Ehren Kret 7da9e88c0b Add hashKey to RemoteConfig
This allows the percentages for different entries in remote config to
be aligned so one remote config can be a subset of another.
2020-05-13 11:08:22 -07:00
Jon Chambers 1c73c91133 Report the number of days until the CDS CA cert expires as a metric so we can set an alarm. 2020-05-12 12:57:11 -04:00
Jon Chambers b1d11d4f69 Use APNs signing keys instead of expiring certificates. 2020-05-12 12:48:28 -04:00
Jon Chambers 001a9310c3
Support device transfers (SERVER-41, SERVER-42) (#32)
This change introduces a `transfer` device capability and account creation argument in support of the iOS device transfer effort.
2020-05-12 12:23:18 -04:00
Moxie Marlinspike 8ffadfa1f1 Add payment addresses on account attributes update 2020-05-07 09:52:38 -07:00
Jon Chambers 50d7929e76 Drop the GCM `RECEIPT` message type (unused). 2020-05-04 17:51:54 -04:00
Jon Chambers 10840b22c5 Don't let one unregistered device block receipt for others. 2020-05-04 17:51:25 -04:00
Jon Chambers acfbab5915 Update to Pushy v0.13.11. 2020-05-04 17:50:35 -04:00
Ehren Kret 48c324fe86 Use a static sequence of randomness in tests
The RemoteConfigControllerTest#testMath unit test would occassionally
fail because randomness doesn't necessarily group into expected ranges
over a finite trial count. This changes the test to use a predefined
PRNG sequence instead of one that varies with each test so that the
test will no long randomly fail.
2020-04-29 17:31:43 -07:00
Ehren Kret 0c495e7e72 Workaround lack of internal retry on transaction rollback
The get endpoint for key fetching can fail if the transaction cannot
complete because of simultaneous modification. Clients currently
receive 500 from this and retry if it happens, but this test case runs
into it without retrying and then complains that not all the threads
completed successfully. This workaround adds some retry attempts.
2020-04-29 17:10:13 -07:00
Ehren Kret 50ccfee201 Allow remote config to send non-boolean values
This version of remote config allows non-boolean values to be returned
to clients but unfortunately limits the configuration to only one
value or another. There is no way to configure more than two values
for the same key with this setup.
2020-04-29 10:51:10 -07:00
Moxie Marlinspike fa739c9594 Bump zkgroups to 0.7.0 2020-04-28 08:58:57 -07:00
Moxie Marlinspike 95f0ce1816 Support for advertising payment addresses on profile 2020-04-22 12:32:53 -07:00
Moxie Marlinspike a32c8fabed Temporarily move GV2 capability from allMatch to anyMatch 2020-04-20 13:42:36 -07:00
Moxie Marlinspike 6a11501184 Bump zkgroups to 0.6.0 2020-04-20 13:41:54 -07:00
Moxie Marlinspike c03fd4645d Bump zkgroups to 0.5.0 2020-04-09 20:36:34 -07:00
Moxie Marlinspike b76c7a4824 Update zkgroups to 0.4.2 2020-04-09 11:21:58 -07:00
Moxie Marlinspike 1408ac77f9 Make storageCapable a boolean result rather than an auth token 2020-04-09 10:19:49 -07:00
Ehren Kret 7e97d10ae1 Fix account dropping new style registration locks 2020-04-06 09:27:23 -07:00
Ehren Kret 56b134facd Change attachment key from long to base64 of 15 bytes 2020-04-02 10:20:42 -07:00
Ehren Kret 41286650cc Create attachments V3 endpoint for CDN2 on GCP
In preparation for resumable uploads, this creates a separate
attachment authorization endpoint that creates a signed URL for
accessing GCP Storage through Signal's CDN2. This should allow Signal
clients to do byte-level resume of media uploads.
2020-04-02 10:20:42 -07:00
Moxie Marlinspike 3c8e7c6c10 Add storage capability and return KBS creds on rereg w/ storage set 2020-03-27 10:45:48 -07:00
Moxie Marlinspike 4f64513c83 Break out redis pubsub into dedicated cluster 2020-03-16 17:44:42 -07:00
Moxie Marlinspike 350f5ccb3c Account for fronted regions 2020-03-14 19:07:42 -07:00
Moxie Marlinspike ac1153c7cf Additional limits 2020-03-14 18:10:07 -07:00
Moxie Marlinspike 3b1672a4a7 Update zkgroups to 0.4.0 2020-03-14 16:30:13 -07:00
Moxie Marlinspike 009f81a9a6 Update to dropwizard 2.x 2020-03-14 16:30:13 -07:00
Moxie Marlinspike 8b10b1dc62 Remove tombstone column from keys table 2020-02-25 12:25:34 -08:00
Moxie Marlinspike 077c259d5b Migrate keys to accountsdb 2020-02-23 17:59:30 -08:00
Moxie Marlinspike e5746c19cf Support for GV2 capability flag 2020-02-07 11:53:28 -08:00
Moxie Marlinspike e399f9e851 Generate external creds for KBS based on UUID 2020-01-22 13:47:33 -08:00
Moxie Marlinspike e4e20c2d25 Add support for UUID buckets in remote config 2020-01-22 11:28:08 -08:00
Moxie Marlinspike 08a70664f4 Support for getting/setting remote config variables 2020-01-21 13:38:58 -08:00
Moxie Marlinspike 1d76c644cb Update version of embedded pg 2020-01-21 13:03:55 -08:00
Moxie Marlinspike 75fc35ee4b Parameterize access to zk operations 2020-01-21 11:29:08 -08:00
Moxie Marlinspike ba3102d667 Support for versioned profiles
Includes support for issuing zkgroup auth credentials
2020-01-21 11:04:06 -08:00
Moxie Marlinspike 8a9fed64f2 Support for first/last profile name length 2020-01-13 18:55:04 -08:00
Moxie Marlinspike 71c7e30548 Increase max size for sticker manifest 2019-12-19 10:29:47 -08:00
Moxie Marlinspike 940bd55079 Update libphonenumber to 8.11.0 2019-12-18 17:32:39 -08:00
Moxie Marlinspike 886db1a2c3 Bump max sticker count to 201 2019-12-18 17:08:51 -08:00
Moxie Marlinspike b4c06db031 Make redis failures on write-back retrieve non-fatal 2019-11-20 12:36:22 -08:00
Moxie Marlinspike 82486a873a Delete old username mapping when setting new one 2019-11-20 12:36:22 -08:00
Moxie Marlinspike 99760ba6a0 Put UUID on server-generated delivery receipt 2019-11-20 12:36:22 -08:00
Moxie Marlinspike 2b987e6e93 Usernames can't start with numbers 2019-11-20 12:36:22 -08:00
Moxie Marlinspike 523134f24b Username reservation table 2019-11-20 12:36:22 -08:00
Moxie Marlinspike 99c228dd6d Support for setting and looking up usernames 2019-11-20 12:36:22 -08:00
Moxie Marlinspike 44d38a00d4 Fix capabilities NPE 2019-11-14 13:36:40 -08:00
Moxie Marlinspike c623f70caa Add support for capabilities 2019-11-14 13:36:40 -08:00
Jeffrey Griffin f16b783378 return backup, not storage, credentials for reg lock 2019-11-05 10:36:33 -08:00
Moxie Marlinspike a8c932ffe4 Update dropwizard to 1.3.16 2019-10-30 19:32:40 -07:00
Brian Acton be4b75932b since onCrawlChunk() is now protected, we need to invoke timeAndProcessChunk() in our unit tests 2019-10-29 18:20:03 -07:00
Jeffrey Griffin 04d7f3a5dc allow disabled accounts to get KBS auth 2019-10-29 16:50:47 -07:00
Brian Acton eddfacd0f4 add timers to the account crawler listeners 2019-10-25 21:30:48 -07:00
Jeffrey Griffin 69742839c0 uuid-based account crawler 2019-08-27 14:42:14 -07:00
Moxie Marlinspike 20b5f0e681 Reset cache index 2019-08-27 14:08:50 -07:00
Moxie Marlinspike 3803b8f284 Fix for jedis pool deadlock
1) Remove nested pool checkouts

2) Add a max wait so it won't block forever on deadlock
2019-08-27 14:02:42 -07:00
Moxie Marlinspike e3daf743f2 Fix new account calculation 2019-08-27 11:14:11 -07:00
Moxie Marlinspike ae5da74bb1 Update banner 2019-08-26 16:08:30 -07:00
Jeffrey Griffin cf78047830 revert to phone number-based account crawler 2019-08-26 14:00:15 -07:00
Moxie Marlinspike 284428a45a Support for authentication to KBS 2019-08-26 11:09:54 -07:00
Moxie Marlinspike 79f2efdfd9 Make UUID in sealed sender certificate optional for buggy clients 2019-08-26 11:09:54 -07:00
Jeffrey Griffin 07822b371f replicate uuids to contact discovery 2019-08-26 11:09:54 -07:00
Moxie Marlinspike 7a3a385569 Support for UUID based addressing 2019-08-26 11:09:54 -07:00
Moxie Marlinspike e57f78cf90 Add meter for GCM challenge transmissions 2019-08-01 13:30:49 -07:00
Moxie Marlinspike 10724fee04 Support for sticker pack uploads 2019-07-24 16:29:56 -07:00
Moxie Marlinspike 4d09bae09b Add some logging 2019-07-11 19:57:31 -07:00
Moxie Marlinspike 11902dec3c Support for v2 registration lock 2019-07-11 18:15:14 -07:00
Moxie Marlinspike 4fdbe9b9ff Support for push preauth 2019-07-11 18:15:10 -07:00
Moxie Marlinspike a6e7e30177 Add requester to recaptcha validation 2019-07-11 17:31:34 -07:00
Moxie Marlinspike 5b69ff7e94 Break out keys database and accounts database 2019-06-19 17:16:37 -07:00
Moxie Marlinspike bc0c6be4c5 We don't need to support disabled accounts for the signed PK API 2019-06-12 12:32:15 -07:00
Moxie Marlinspike f56d219882 Update dropwizard to 1.3.12 2019-06-11 09:29:30 -07:00
Moxie Marlinspike 3c6b418ca8 Publish fcm retry metrics 2019-05-30 11:05:05 -07:00
Moxie Marlinspike 105a38a7db Update gcm-sender-async to use jdk11 httpclient 2019-05-30 10:46:40 -07:00
Moxie Marlinspike e6f25b9c5e Bring gcm-sender-async in as a module 2019-05-29 11:03:33 -07:00
Moxie Marlinspike 6e0b956e61 Only set the uninstall feedback timestamp when it's zero
Otherwise each send will update the timestamp, preventing it from
aging out to the point where the cleaner will pick it up.
2019-05-26 14:27:30 -07:00
Moxie Marlinspike a029768d24 Reenable account cleaner 2019-05-10 10:42:42 -07:00
Moxie Marlinspike 4d9c9206cf Delay processing FCM uninstalled feedback
Check to make sure client is not still active before unregistering,
since FCM feedback seems to be often erroneous
2019-05-07 10:04:22 -07:00
Moxie Marlinspike 35116f9229 Clean up concepts of enabled account state
1) Rename "active" methods to be "enabled," since they aren't
   really about "activity."

2) Make authentication fail if a device or account is in dissabled
   state.

3) Let some controllers authenticate accounts that are in a
   disabled state.
2019-05-04 12:31:50 -07:00
Moxie Marlinspike a1f90cd39b Temporarily disable account cleaner 2019-05-03 12:09:01 -07:00
Moxie Marlinspike 45dc7459b8 Temporarily disable GCM unregistered feedback 2019-05-03 11:51:21 -07:00
Jeffrey Griffin 6877b663f1 enable up to 40 account updates per chunk in AccountCleaner 2019-05-03 10:58:57 -07:00
Jeffrey Griffin 3c69f81a10 expire accounts explicitly 2019-05-02 21:14:57 -07:00
Jeffrey Griffin d316d57e5d fix DirectoryController tests 2019-05-02 19:20:23 -07:00
Jeffrey Griffin 92eddf8eb6 Directory feedback v3 2019-05-02 15:49:27 -07:00
Moxie Marlinspike 0c81556b90 Switch websocket-resources from ListenableFuture to CompletableFuture 2019-05-02 15:05:44 -07:00
Moxie Marlinspike d72828b3f4 Fix assembly for multi-module 2019-05-01 14:02:18 -07:00
Moxie Marlinspike 9220f4d829 Add websocket-resources as a module 2019-05-01 13:19:15 -07:00
Moxie Marlinspike 66917cd2c0 Add some dependency exclusions 2019-05-01 13:19:15 -07:00
Moxie Marlinspike d0d375aeb7 Break out into a multi-module project 2019-05-01 13:19:11 -07:00