825 Commits

Author SHA1 Message Date
Ehren Kret 0dcb4b645c Build Dynamo DB backed Message Store (#358)
* Work in progress...

* Finish first pass draft of MessagesDynamoDb

* Use begins_with everywhere for destination device id

* Remove now unused methods

* First basic test built

* Add another test case

* Remove comment

* Verify more of the message contents

* Ensure all methods are tested

* Integrate MessagesDynamoDb into the MessagesManager

This change plugs the MessagesDynamoDb class into the live serving
flow in MessagesManager.

Tests are not yet as comprehensive for this big a change as they
should be, but they now compile and pass so checkpointing here with a
commit.

* Put DynamoDB before RDBS when deleting specific messages

* Extract method

* Make aws sdk version into a property

* Rename clientBuilder

* Discard messages with no GUID

* Unify batching logic into one function

* Comment on the source of the value in this constant

* Inline method

* Variable name swizzle

* Add timers to all public methods

* Add missing return statements

* Reject messages that are too large with response code 413

* Add configuration to control dynamo DB timeouts

* Set server timestamp from the ReceiptSender

* Change to shorter key names to optimize IOPS

* Fix tests broken by changing column names

* Fix broken copyright template output

* Remove copyright template error text

* Add experiments to control use of dynamo and rds in message storage

* Specify instance profile credentials for the dynamic configuration manager

* Use property for aws sdk version

* Switch dynamo to instance profile credentials

* Add metrics to the batch write loop

* Use placeholders in logging
2021-02-03 10:03:19 -06:00
Jon Chambers fc4c8d6054 Update to the latest version of libphonenumber. 2021-02-01 21:25:14 -05:00
Jon Chambers 1a27c7eabc Add a (failing) test for new Ivory Coast phone numbers. 2021-02-01 21:25:14 -05:00
Jon Chambers 408b959441 Require a push challenge when registering (or else require a captcha). 2021-02-01 20:44:21 -05:00
Jon Chambers 35fc98a188 Add an experiment enrollment manager. 2021-02-01 11:08:16 -05:00
Moxie Marlinspike 92f6a79e1f Add a dynamic configuration manager 2021-02-01 11:01:58 -05:00
Jon Chambers 8f94ed68a3 Ignore expired devices when checking for GV1->GV2 migration capability. 2021-01-30 16:55:05 -05:00
Jon Chambers ce1a4b94cb Actually store emoji/about text in the database. 2021-01-27 10:34:13 -05:00
Jon Chambers 92a0deffcf Add more robust tests for about/emoji fields. 2021-01-27 10:34:13 -05:00
Jon Chambers 97b6f6028b Fix a minor typo in the help text for a feature flag task. 2021-01-25 18:03:38 -05:00
Jon Chambers 611e8c39ee Actually drop feature flag config. 2021-01-25 15:20:06 -05:00
Jon Chambers 01f1c263a6 Add a meter for captcha requests. 2021-01-25 14:58:27 -05:00
Jon Chambers 24ea6a9f1d Revert "Temporarily disable registration abuse system"
This reverts commit 22ef058cb6.
2021-01-25 14:58:27 -05:00
Jon Chambers 46c800b8b7 Smoosh request logging tasks together rather than having one task for each direction. 2021-01-25 14:58:15 -05:00
Jon Chambers f10be893ce Drop the old feature flag controller. 2021-01-25 14:55:57 -05:00
Jon Chambers c606c1664f Add admin tasks for listing, setting, and deleting feature flags. 2021-01-25 14:55:57 -05:00
Jon Chambers 225932b4c9 Add emojis/"about" text to profiles 2021-01-20 15:42:47 -05:00
Jon Chambers 6b850b9894 Allow (versioned) profile names up to 380 base64 characters long. 2021-01-20 11:08:10 -05:00
Jon Chambers 943a5d1036 Shard push scheduling cache 2021-01-19 15:50:12 -05:00
Moxie Marlinspike b25da8ceaa Don't attempt SMS to iran (#355) 2021-01-19 09:13:37 -08:00
Ehren Kret 10cdb7387d Be consistent with use of DataSize class 2021-01-18 17:01:43 -06:00
Ehren Kret dd436dd1dd Create a Meter for tracking messages larger than 256kib 2021-01-18 17:01:43 -06:00
Jon Chambers 13b84635b5 Drop an unused message database index. (#352) 2021-01-18 10:26:03 -06:00
Moxie Marlinspike 27534d408f Log when messages cache detects topology change (#354)
Co-authored-by: Moxie Marlinspike <moxie+github@signal.org>
2021-01-17 17:13:23 -08:00
Jon Chambers 0a23ce870a Allow message persisters to be disabled by a feature flag. 2021-01-17 11:13:12 -05:00
Jon Chambers c355ef8d53 Reduce the message cache thread pool size. 2021-01-16 11:15:25 -05:00
Jon Chambers 1feb23ba99 Stop periodic topology refreshes. 2021-01-16 03:35:36 -05:00
Jon Chambers 59a0fd0799 Embiggen message cluster thread pool. 2021-01-16 02:57:04 -05:00
Jon Chambers 00b5cfcf17 Allow the client presence manager to use an entirely separate cluster. 2021-01-16 02:57:04 -05:00
Jon Chambers 9e342f253d Use the same client for inserts and reads in the message cache cluster. 2021-01-16 01:50:40 -05:00
Jon Chambers 20c48b6bb2 Expand message-related thread pools to 1 thread per shard. 2021-01-16 01:50:40 -05:00
Jon Chambers 4f9e7bb572 Separate Lettuce thread pools. 2021-01-16 01:18:05 -05:00
Jon Chambers 0a322d5a9f Add a "doomsday switch." 2021-01-15 18:05:18 -05:00
Jon Chambers 59eb6d10c1 Gate based on destination rather than random. 2021-01-15 18:05:18 -05:00
Jon Chambers a57ce1dd17 Add machinery to allow a percentage of message sends to succeed. 2021-01-15 17:05:16 -05:00
Moxie Marlinspike b100b3c36b Reject traffic without logging exceptions 2021-01-15 16:23:53 -05:00
Jon Chambers 81c1ba6eef Respond to all "message send" attempts with HTTP/503. 2021-01-15 15:34:14 -05:00
Jon Chambers 93ae4d1ee6 Move the client presence manager to its own breaker. 2021-01-15 13:51:39 -05:00
Jon Chambers 9c53d818f4 Use separate clusters for message cache read/write operations. 2021-01-15 13:51:39 -05:00
Jon Chambers e5a2c1ab10 Always return an empty list of prekeys. 2021-01-15 12:27:10 -05:00
Jon Chambers 67ed035b36 Retry serializable key transactions. 2021-01-13 17:38:29 -05:00
Jon Chambers ad30786f4a Parallelize message persisters. 2021-01-12 18:50:14 -05:00
Jon Chambers 2e01da5ec1 Add a task to enable/disable accelerated crawling. 2021-01-11 19:29:18 -05:00
Jon Chambers 8fb37a0024 Log when a crawling cycle has wrapped up. 2021-01-11 19:29:18 -05:00
Jon Chambers 9412a7424c Return HTTP/429 whenever somebody tries to get contacts from the old directory system. 2021-01-11 19:29:10 -05:00
Jon Chambers f8cbb4f386 Temporarily suspend client version metrics to reduce load on our metric aggregator. 2021-01-11 14:04:44 -05:00
Ehren Kret 86ccaa52a5 Allow configuration of multiple directory account crawler listeners (#325)
* Allow configuration of multiple directory account crawler listeners

Only one should update the local redis directory. This one is marked
with replicationPrimary true. The others in the list only serve to
issue replication requests over to CDS replication load balancers.

* Update one more metric name
2021-01-10 17:11:02 -06:00
Jon Chambers cc3e5d23e4 Enable Lettuce adaptive topology refreshes. 2021-01-10 16:20:35 -05:00
Jon Chambers cac86d1f77 Standardize toplogy event handling strategy. 2021-01-10 15:14:12 -05:00
Jon Chambers 22f7bb822f Raise log level of toplogy changes. 2021-01-10 15:14:12 -05:00
Jon Chambers 1b53f10091 Reload scripts across the whole cluster if one shard is missing the script. 2021-01-10 15:00:12 -05:00
Jon Chambers bac268a21c Don't send a reply to clients until messages are safely in a non-volatile store. 2021-01-10 13:03:40 -05:00
Jon Chambers 321e6e6679 Don't validate cluster membership (allow new shards to join dynamically). 2021-01-10 12:58:35 -05:00
Moxie Marlinspike 22ef058cb6 Temporarily disable registration abuse system 2021-01-09 15:57:55 -05:00
Jon Chambers 9ee6419bc0 Publish directory updates to multiple SQS queues. 2021-01-08 18:07:18 -05:00
Jon Chambers 3bf0188e7f Turn off alphanumeric sender ID for all countries. 2021-01-08 06:18:53 -05:00
Jon Chambers 91fc0fd623 Revert "Delete data in the storage service when deleting accounts."
This reverts commit ff1a721d5b.
2021-01-08 06:18:39 -05:00
Jon Chambers d2fcf68381 Record the status message when clients reject websocket messages. 2020-12-23 12:29:15 -05:00
Jon Chambers a4d0c17efd Record OS versions for iOS requests. 2020-12-23 11:36:31 -05:00
Jon Chambers ff1a721d5b Delete data in the storage service when deleting accounts. 2020-12-23 11:35:38 -05:00
Jon Chambers c870a1bbd5 Introduce a storage service client. 2020-12-23 11:35:38 -05:00
Ehren Kret ebf332a8c9 Record delivery duration excluding noise from non-primary devices (#311)
* Record delivery duration excluding noise from non-primary devices

* Extract method
2020-12-21 10:28:39 -06:00
Jon Chambers 85d1fff18f Actually increment the Android request counter. 2020-12-11 11:46:07 -05:00
Jon Chambers 6bb106c2cb Drop the Redis command timeout back down to 3 seconds to facilitate debug data collection. 2020-12-11 11:20:10 -05:00
Jon Chambers e551fd2c1b Revert "Pause checks for GV1 migration when checking for capability downgrades."
This reverts commit e7745db36e.
2020-12-10 17:02:41 -05:00
Jon Chambers 34a11c2338 Record OS versions for desktop and SDK versions for Android. 2020-12-10 17:02:05 -05:00
Jon Chambers 0de3a400eb Record unsuccessful server-to-client requests in more detail. 2020-12-10 17:01:46 -05:00
Jon Chambers e524ff965d Add a utility method for getting client platform tags from UA strings for metrics. 2020-12-10 17:01:46 -05:00
Jon Chambers 7ba689aaeb Measure adoption of the `gv1-migration` capability. 2020-12-09 19:08:52 -05:00
Jon Chambers 92fde83b3a Discard oversized messages bound for desktop clients via websockets. 2020-12-07 15:03:35 -05:00
Jon Chambers 3a268aef50 Reduce logging level for Lettuce connection events. 2020-12-07 11:56:41 -05:00
Jon Chambers f673bd8d7b Set device capabilities when linking a new device. 2020-12-02 13:21:08 -05:00
Ehren Kret 299b680013 Always include UUID in UD certificate (#300) 2020-12-01 08:56:55 -06:00
Jon Chambers 81e8352391 Time (and count) SQS "send message" operations. 2020-11-25 15:05:05 -05:00
Jon Chambers 1a627d6a87 Extend Redis command timeout to 3.5 seconds to avoid TCP retransmission "coincidences." 2020-11-25 15:04:06 -05:00
Ehren Kret 00a3e562dc Force use of UCS-2 instead of GSM-7 for SMS to China (#297) 2020-11-20 14:41:48 -06:00
Jon Chambers 0628c9161c Use named threads for the JsonMetricsReporter executor service. 2020-11-18 15:46:14 -05:00
Jon Chambers 9b28672e19 Honor disabled metric attributes in JsonMetricsReporter. 2020-11-18 15:46:14 -05:00
Jon Chambers d764058a04 Measure contact intersection rate directly. 2020-11-18 14:28:53 -05:00
Jon Chambers 0aafe38496 Stop recording Lettuce latency metrics. 2020-11-17 13:20:37 -05:00
Jon Chambers e7745db36e Pause checks for GV1 migration when checking for capability downgrades. 2020-11-17 09:25:12 -05:00
Jon Chambers 474b879b16 Only notify CDS if an account attribute change actually changes an account's discoverability. 2020-11-16 10:54:12 -05:00
Jon Chambers 0a23b57ff8 Report Dropwizard metrics via the Wavefront proxy. 2020-11-13 17:14:13 -05:00
Jon Chambers 251e1b51c5 Make Micrometer batch size configurable. 2020-11-13 17:13:39 -05:00
Jon Chambers 217d270457 Update to Lettuce 6.0.1. 2020-11-13 10:50:21 -05:00
Jon Chambers 143b6f0df1 Revert "Add a debug version of Lettuce to track down the cause of https://github.com/lettuce-io/lettuce-core/issues/1494."
This reverts commit 4d5fbec5a5.
2020-11-13 10:50:21 -05:00
Jon Chambers 2cc6c959a5 Revert "Temporarily suspend reporting of Lettuce latency metrics."
This reverts commit 2045153495a823b06334e7cbd86fb89c946c1cea.
2020-11-11 13:05:49 -05:00
Jon Chambers fb9aa672c9 Include the name of the calling thread when a command times out. 2020-11-11 13:05:35 -05:00
Jon Chambers 325e65db7f Expand UA parsing tests to cover OS details in desktop strings. 2020-11-11 13:05:18 -05:00
Jon Chambers 103b49ec45 Record the number of non-success responses from clients when sending messages via websockets. 2020-11-10 11:47:57 -05:00
Jon Chambers 6c78d7544f Capture a thread dump when Redis commands time out. 2020-11-10 11:47:39 -05:00
Jon Chambers 4d5fbec5a5 Add a debug version of Lettuce to track down the cause of https://github.com/lettuce-io/lettuce-core/issues/1494. 2020-11-10 11:45:46 -05:00
Jon Chambers 7cf50a15d0 Include client age/UA string when closing due to a spurious keepalive request. 2020-11-10 11:45:12 -05:00
Jon Chambers adbc4e9fec Record the platforms of clients that send a keepalive without a local presence. 2020-11-10 11:45:12 -05:00
Jon Chambers 4815434dd7 Record the platforms of clients that are getting displaced. 2020-11-10 11:45:12 -05:00
Jon Chambers b25e50bdae Drop API keys from Micrometer configuration. 2020-11-09 09:26:56 -05:00
Ehren Kret 604287244f Update copyright statement on all source files
IntelliJ Copyright Profile used to automate this.
2020-11-04 11:55:35 -05:00
Jon Chambers 4a4a721e90 Log timeouts in addition to incrementing a counter to make it easier to get precise timestamps. 2020-10-30 11:35:59 -04:00
Jon Chambers a4062b338e Count timeouts directly. 2020-10-29 10:51:18 -04:00
Ehren Kret 5587b7d469 Expose gv1-migration on profile endpoint 2020-10-28 13:00:57 -04:00
Ehren Kret 26870d134f Set source UUID when delivering envelopes from message cache/db on websocket 2020-10-28 12:38:32 -04:00
Jon Chambers fb2baad7cc Restore netty-tcnative. 2020-10-28 12:29:30 -04:00
Jon Chambers 0431a2abb1 De-dupe connection event logging messages. 2020-10-28 12:29:14 -04:00
Ehren Kret c2db2d3cbd Add GV1 Migration capability 2020-10-27 16:17:21 -04:00
Jon Chambers 05d9ec673e Send push notifications if websockets close before all messages are delivered 2020-10-27 16:02:55 -04:00
Jon Chambers 1732cf9243 Add filters/tasks to enable/disable request logging. 2020-10-23 11:35:06 -04:00
Jon Chambers ab62c19de9 Temporarily suspend reporting of Lettuce latency metrics. 2020-10-23 11:30:42 -04:00
Jon Chambers 96d3a69479 Use container-managed executors for APN/GCM senders. 2020-10-23 11:30:03 -04:00
Jon Chambers 8523bb1ad8 Change the "oversized message" threshold from 64kB to 1MB. 2020-10-23 11:13:19 -04:00
Jon Chambers 169c3d5a0f Update to Pushy 0.14.2. 2020-10-21 15:20:36 -04:00
Jon Chambers 9cffbe3d49 Drop netty-tcnative-boringssl-static as a dependency. 2020-10-21 15:20:36 -04:00
Jon Chambers e6da54d9b8 Resolve build error introduced while merging. 2020-10-20 19:04:44 -04:00
Jon Chambers 0a843dc086 Tighten the "prune peers" interval; move from fixed-rate to fixed-delay scheduling. 2020-10-20 19:00:55 -04:00
Jon Chambers 7b3ed2dcbf Catch exceptions thrown while pruning missing peers. 2020-10-20 19:00:55 -04:00
Jon Chambers 42ed6c3ded Add clients to the "cleanup" list before actually setting their presence keys. 2020-10-20 19:00:55 -04:00
Jon Chambers 23ca011ac1 Record account deletion reasons. 2020-10-20 19:00:34 -04:00
Jon Chambers d82b3dc429 Record a count of deleted accounts by country. 2020-10-20 19:00:34 -04:00
Jon Chambers e391793c58 Remove now-redundant Redis execution time metrics. 2020-10-20 19:00:11 -04:00
Jon Chambers 236cef4b56 Report Lettuce command latency via Micrometer. 2020-10-20 19:00:11 -04:00
Jon Chambers 45687513bf Revert "Revert "Share resources between Lettuce clients.""
This reverts commit 334f509be599fa6a501026e900d912ff7187e150.
2020-10-20 19:00:11 -04:00
Jon Chambers 019ffdaf12 Add a command for dumping Redis command stats. 2020-10-20 18:59:44 -04:00
Jon Chambers 1a57d4fe11 Update to Lettuce 6. 2020-10-20 18:59:26 -04:00
Jon Chambers df847431eb Measure total bytes written to websockets and failed send attempts. 2020-10-20 17:22:30 -04:00
Jon Chambers 99f488d48f Drop websocket connection names (unused for a while now). 2020-10-19 11:24:35 -04:00
Jon Chambers 05929871c9 Rename PushSender to MessageSender and add docs. 2020-10-19 11:24:35 -04:00
Jon Chambers 74b3daa70a Collapse WebsocketSender into PushSender. 2020-10-19 11:24:35 -04:00
Jon Chambers 5e30b0499a Move provisioning message-sending to its own manager class. 2020-10-19 11:24:35 -04:00
Jon Chambers 85c7347899 Add a command for dumping Redis SLOWLOG output. 2020-10-15 12:18:37 -04:00
Jon Chambers 3a84775912 Log cluster topology change events, too. 2020-10-13 16:07:08 -04:00
Jon Chambers 290a82e61c Log when Lettuce connection events happen. 2020-10-13 16:07:08 -04:00
Jon Chambers adac7d7fb2 Estimate the size of message entity lists sent via the REST API. 2020-10-13 15:49:11 -04:00
Jon Chambers 52320ebb91 Revert "Share resources between Lettuce clients."
This reverts commit eab1f503a5.
2020-10-13 12:44:54 -04:00
Jon Chambers eab1f503a5 Share resources between Lettuce clients. 2020-10-11 14:36:28 -04:00
Jon Chambers a9d0aa136d Add OS-reported metrics for cached/buffered memory. 2020-10-11 13:43:15 -04:00
Jon Chambers 691ab3080d Fix some metrics names/types. 2020-10-11 12:37:17 -04:00
Jon Chambers c5147e0c68 Report direct memory metrics. 2020-10-11 11:37:51 -04:00
Jon Chambers e9b0829860 Report the maximum number of file descriptors allowed by the OS. 2020-10-11 11:27:57 -04:00
Jon Chambers 95428ab8b0 Report GC metrics. 2020-10-11 11:08:24 -04:00
Jon Chambers 775d56fe52 Drop the "repair message queue metadata" script. 2020-10-09 18:18:30 -04:00
Jon Chambers ac2ff29288 Make sure to close scheduled reporters. 2020-10-09 18:05:00 -04:00
Jon Chambers 8e1975efe4 Record the number of deletable accounts per crawled chunk. 2020-10-08 10:51:41 -04:00
Curt Brune 39c09733d3 Add /v1/payments/auth endpoint 2020-10-08 10:51:01 -04:00
Jon Chambers e1c397993d Require Android clients to support the gv2-3 capability 2020-10-06 16:49:49 -04:00
Jon Chambers 58ca4baf71 Time account deletion operations. 2020-10-06 11:04:47 -04:00
Jon Chambers 5245b68689 Remove temporary metrics. 2020-10-06 11:04:47 -04:00
Jon Chambers 2b6811cb1b Really delete old accounts instead of just removing their push channels. 2020-10-06 11:04:47 -04:00
Jon Chambers c82496b972 Remove the "repair queue metadata" script. 2020-10-05 16:57:16 -04:00
Jon Chambers c31348ea9a Drop the "insert messages" timeout. 2020-10-05 16:57:01 -04:00
Jon Chambers c885540749 Check that the return of ZRANGEBYSCORE isn't an empty list. 2020-10-05 10:38:40 -04:00
Jon Chambers bb087caddc Don't panic if a queue exists, but is empty when repairing metadata. 2020-10-04 16:09:56 -04:00
Jon Chambers 5e3f8b9c2e Disallow insertion of duplicate messages. 2020-10-04 15:34:14 -04:00
Jon Chambers 1ccfe928f7 Add a test to make sure that we don't double-insert messages with the same GUID. 2020-10-04 15:34:14 -04:00
Jon Chambers 3016269268 Revert "Temporarily disable the message persisters entirely."
This reverts commit d464721397.
2020-10-04 15:25:06 -04:00
Jon Chambers 952cfae4e6 Repair queue metadata before persisting queues. 2020-10-04 15:25:06 -04:00
Jon Chambers df7f209ebc Revert "Don't insert message batches in transactions."
This reverts commit 16eefe333f.
2020-10-04 15:12:15 -04:00
Jon Chambers d464721397 Temporarily disable the message persisters entirely. 2020-10-04 11:44:35 -04:00
Jon Chambers 551a85c1e6 Use named variables instead of referring to KEYS/ARGV array indices in message cache scripts. 2020-10-04 11:27:27 -04:00
Jon Chambers 2686761608 Instrument "get queues to persist" calls and "persist queues" exceptions. 2020-10-04 10:48:42 -04:00
Jon Chambers 02a2c3224f Discard unused feature flag constants/mocking. 2020-10-04 10:48:42 -04:00
Jon Chambers 8ec1dda9ba Give the persister worker thread a meaningful name. 2020-10-04 10:48:42 -04:00
Jon Chambers 0308532523 Set a query timeout of 5 seconds when inserting batches of messages. 2020-10-04 10:48:42 -04:00
Jon Chambers 10b3af2947 Revert "Insert messages individually."
This reverts commit 158bfe4816.
2020-10-04 10:48:42 -04:00
Jon Chambers 158bfe4816 Insert messages individually. 2020-10-03 13:13:34 -04:00
Jon Chambers 16eefe333f Don't insert message batches in transactions. 2020-10-03 11:43:42 -04:00
Jon Chambers 65e585e122 Pause only if we're running low on queues to persist. 2020-10-03 11:43:34 -04:00
Jon Chambers 2ba36ee04c Add a gauge for worker thread liveness. 2020-10-03 11:43:34 -04:00
Jon Chambers fc05529574 Let MessagePersister manage its own worker thread. 2020-10-03 11:43:34 -04:00
Jon Chambers 07d24f487a Don't re-register metrics for shared circuit breakers. 2020-10-02 15:05:00 -04:00
Jon Chambers 811acdb7f5 Use separate namespaces for Redis breaker/retry metrics. 2020-10-02 10:57:05 -04:00
Jon Chambers a7266364d1 Refactor peer pruning to be more retry-friendly. 2020-10-01 17:17:07 -04:00
Jon Chambers e83b41dc01 Reduce default Redis cluster command timeout to 3 seconds. 2020-10-01 17:17:07 -04:00
Jon Chambers 76665dd56e Retry Redis commands that time out. 2020-10-01 17:17:07 -04:00
Jon Chambers 2d42b478ba Consolidate cluster and pub/sub circuit breakers. 2020-10-01 17:17:07 -04:00
Jon Chambers 885fa6beae Add tests for Device#isEnabled. 2020-10-01 12:54:35 -04:00
Jon Chambers 65cdd5fcbe Drop the 365-day check when deciding if an account is enabled. 2020-10-01 12:54:35 -04:00
Jon Chambers 4302e19aba Register a UUID argument factory for the messages database. 2020-10-01 11:06:43 -04:00
Jon Chambers 0c6f05f34a Add a (failing!) test for sending a sealed-sender message after a non-sealed-sender message. 2020-10-01 11:06:43 -04:00
Jon Chambers 8040c285cd Include stack traces when reporting persistence issues. 2020-09-30 11:47:16 -04:00
Jon Chambers ada454f56f Add a meter for persisting individual messages. 2020-09-30 10:39:56 -04:00
Jon Chambers 57d2ef8740 Return queues to the "to persist" list if something goes wrong during persistence. 2020-09-30 10:39:56 -04:00
Jon Chambers a97e0982e3 Add an integration test for message persistence. 2020-09-30 10:39:56 -04:00
Jon Chambers eaa2060d84 Fix an incorrect locking key and some previously-suppressed lock contention issues. 2020-09-30 10:39:56 -04:00
Jon Chambers 3e02c574e7 Log exceptions when persisting messages. 2020-09-30 10:39:56 -04:00
Jon Chambers c7230ccbb0 Remove messages from the cache in bulk. 2020-09-29 10:58:02 -04:00
Jon Chambers fc71ced660 Persist messages in batches. 2020-09-29 10:58:02 -04:00
Jon Chambers 6041a9d094 Make exit conditions slightly more conservative. 2020-09-29 10:58:02 -04:00
Jon Chambers 599cd766e1 Let Dropwizard manage persister thread lifecycles. 2020-09-29 10:58:02 -04:00
Alan Evans e64c8007c0 Detect GV2 capability in non-gcm Android devices 2020-09-28 15:54:10 -04:00
Jon Chambers 9339823e84 Add temporary metrics to monitor the ratio of enabled/disabled accounts. 2020-09-28 15:33:52 -04:00
Jon Chambers e6d4620af1 Only allow linking desktop clients if they support the third-generation GV2 capability. 2020-09-25 17:08:32 -04:00
Jon Chambers 656e6db846 Only consider desktop devices GV2-capable if they send the third-gen GV2 capability. 2020-09-25 17:08:32 -04:00
Jon Chambers 30474e3a2b Add a test for message ordering. 2020-09-25 11:41:58 -04:00
Jon Chambers 460bd98f1b Add metrics for messages missing GUIDs. 2020-09-25 11:41:22 -04:00
Jon Chambers a553eba574 Add an API endpoint for deleting accounts. 2020-09-25 11:39:17 -04:00
Jon Chambers 61f515670c Add plumbing for deleting accounts and all associated data. 2020-09-25 11:39:17 -04:00
Jon Chambers 789af0f8a6 Add support for deleting keys associated with an account. 2020-09-25 11:39:17 -04:00
Jon Chambers 86fae58c96 Add support for deleting account entities from the database. 2020-09-25 11:39:17 -04:00
Jon Chambers c54d3abe47 Check for the second-gen GV2 capability when linking devices. 2020-09-24 19:04:02 -04:00
Jon Chambers 6fe511eb50 Fix a bad size check when loading stored messages. 2020-09-23 18:02:33 -04:00
Jon Chambers 17d18b22c7 Drop pub/sub sending logic from WebsocketSender. 2020-09-23 14:51:02 -04:00
Jon Chambers 66a04ed730 Don't explicitly notify clients when messages get persisted. 2020-09-23 14:51:02 -04:00
Jon Chambers 7e14a0bc30 Drop pub/sub operations from WebsocketConnection. 2020-09-23 14:51:02 -04:00
Jon Chambers 77de0f86dc Require desktop clients to send the new gv2-2 capability flag. 2020-09-23 12:05:58 -04:00
Jon Chambers 3b4bc9163a Untangle thread pool names, tweak sizes, and add instrumentation. 2020-09-22 10:21:33 -04:00
Jon Chambers e146135bd1 Don't attempt to send more messages if sending failed for any reason. 2020-09-22 10:21:33 -04:00
Jon Chambers e9e18afb4a Add a (failing) integration test demonstrating an infinite loop. 2020-09-22 10:21:33 -04:00
Jon Chambers 62c31eb202 Revert "Revert keyspace delivery for all messages"
This reverts commit 4dc49604b6.
2020-09-22 10:21:33 -04:00
Jon Chambers 1eacee85ae Count how many iOS users set the old GV2 capability flag. 2020-09-21 18:58:07 -04:00
Jon Chambers 5986145282 Add a second-generation GV2 capability and ignore the old capability for iOS devices. 2020-09-21 18:57:53 -04:00
Jon Chambers b134a69a28 Record the number of authentications for users with/without GV2 support. 2020-09-21 15:42:13 -04:00
Jon Chambers 83f9eacac4 Refactor UserAgentTagUtil to parse UA strings with UserAgentUtil. 2020-09-21 12:24:08 -04:00
Jon Chambers baab6b951b Add a general utility class for parsing user-agent strings. 2020-09-21 12:24:08 -04:00
Jon Chambers b041fbe3ec Add semver4j as a dependency. 2020-09-21 12:24:08 -04:00
Jon Chambers 903a1bec91 Reject (eventually) oversize messages. 2020-09-17 17:07:20 -04:00
Jon Chambers ebc3a251b7 Drop the UUID addressing capability flag entirely. 2020-09-14 15:36:29 -04:00
Jon Chambers a567f4a6de Don't check UUID capability when blocking capability downgrades. 2020-09-14 15:36:29 -04:00
Jon Chambers 4dc49604b6 Revert keyspace delivery for all messages
* Revert "Send all messages via keyspace notifications when a feature flag is enabled."

This reverts commit fadcf62166.

* Revert "Consolidate semaphore release logic."

This reverts commit c02b255766.

* Revert "Represent stored message state as an enumeration rather than a collection of booleans."

This reverts commit 89788fa665.

* Revert "Refactor: collapse state into semaphores/atomic booleans."

This reverts commit a052e2ee8f.

* Revert "Refactor: move sendNextMessagePage into its own method."

This reverts commit 158e5004b7.

* Revert "Avoid querying the database if we think all new messages are in the cache."

This reverts commit 6f9ff3be37.

* Revert "Query for more stored messages if an update happens while we're already processing a batch."

This reverts commit f766c57743.

* Revert "Only send the "queue cleared" message once per websocket session."

This reverts commit 8f53152c3e.

* Revert "Let processStoredMessages handle requery logic."

This reverts commit 7bbc88d716.

* Revert "Only allow one thread to process stored messages at a time."

This reverts commit 68256d2343.
2020-09-14 15:35:10 -04:00
Jon Chambers fadcf62166 Send all messages via keyspace notifications when a feature flag is enabled. 2020-09-11 13:12:17 -04:00
Jon Chambers c02b255766 Consolidate semaphore release logic. 2020-09-11 13:12:17 -04:00
Jon Chambers 89788fa665 Represent stored message state as an enumeration rather than a collection of booleans. 2020-09-11 13:12:17 -04:00
Jon Chambers a052e2ee8f Refactor: collapse state into semaphores/atomic booleans. 2020-09-11 13:12:17 -04:00
Jon Chambers 158e5004b7 Refactor: move sendNextMessagePage into its own method. 2020-09-11 13:12:17 -04:00
Jon Chambers 6f9ff3be37 Avoid querying the database if we think all new messages are in the cache. 2020-09-11 13:12:17 -04:00
Jon Chambers f766c57743 Query for more stored messages if an update happens while we're already processing a batch. 2020-09-11 13:12:17 -04:00
Jon Chambers 8f53152c3e Only send the "queue cleared" message once per websocket session. 2020-09-11 13:12:17 -04:00
Jon Chambers 7bbc88d716 Let processStoredMessages handle requery logic. 2020-09-11 13:12:17 -04:00
Jon Chambers 68256d2343 Only allow one thread to process stored messages at a time. 2020-09-11 13:12:17 -04:00
Ehren Kret f88c440c48 Automatically retry when Twilio returns unreachable (#190)
* Parse and log the Twilio error code

* Automatically retry without sender ID when Twilio returns unreachable

* Remove attempt count and pass around whether or not sender id was used
2020-09-10 13:58:39 -05:00
Jon Chambers cfa56ba6d4 Remove the "send online messages via keyspace notifications" feature flag. 2020-09-10 10:41:20 -04:00
Jon Chambers 2c6b646d87 Enforce no capability downgrade on device verification 2020-09-09 16:05:00 -04:00
Jon Chambers e7572094b5 Require all enabled devices to support GV2. 2020-09-09 16:05:00 -04:00
Jon Chambers 5e34823a49 Optionally send online-only messages via keyspace notifications. 2020-09-09 14:42:09 -04:00
Jon Chambers fdef21a871 Record and listen for ephemeral messages in a separate queue. 2020-09-09 14:42:09 -04:00
Jon Chambers d40cff8a99 Revert "Add a system for storing, retrieving, and notifying listeners about ephemeral (online) messages."
This reverts commit 06754d6158.
2020-09-08 15:55:09 -04:00
Jon Chambers 8927e45ded Revert "Optionally send online-only messages via keyspace notifications."
This reverts commit 12fe28d8ab.
2020-09-08 15:55:09 -04:00
Jon Chambers 1a93df92d4 Replace DeliveryStatus with a simple boolean. 2020-09-08 11:29:33 -04:00
Jon Chambers 12fe28d8ab Optionally send online-only messages via keyspace notifications. 2020-09-08 11:19:55 -04:00
Jon Chambers 06754d6158 Add a system for storing, retrieving, and notifying listeners about ephemeral (online) messages. 2020-09-08 11:14:42 -04:00
Jon Chambers 1d5087374e Jettison UUID-or-E164 plumbing in favor of UUID-only. 2020-09-08 09:30:47 -04:00
Jon Chambers 8356264fe0 Rename RedisClusterMessagesCache and related classes to just MessagesCache. 2020-09-08 09:30:47 -04:00
Jon Chambers 18ecd748dd Entirely discard the old message cache machinery. 2020-09-08 09:30:47 -04:00
Jon Chambers e324f27655 Stop sending/processing CONNECTED pub/sub messages. 2020-09-03 13:52:43 -04:00
Jon Chambers afd645fb11 Retrieve messages using commands available in Redis 3. 2020-09-03 13:31:55 -04:00
Jon Chambers 5b42593fbb Persist messages one page at a time. 2020-09-03 12:08:46 -04:00
Jon Chambers 25f3c6a548 Drop our dependency on commons-pool. 2020-09-03 11:05:10 -04:00
Jon Chambers 5c04f2634a Use a dedicated executor service for dispatching keyspace notifications. 2020-09-03 11:04:48 -04:00
Jon Chambers ad01610d1e Rely on the client presence manager to decide whether to send push notifications. 2020-09-03 11:04:48 -04:00
Jon Chambers 697c380cd1 Close websocket connections when displaced. 2020-09-03 11:04:48 -04:00
Jon Chambers 81e8143a43 Rely solely on the clustered message cache. 2020-09-02 11:57:33 -04:00
Jon Chambers 8409986ef5 Mirror persistence operations from the new persister to the old persister. 2020-09-02 11:02:40 -04:00
Jon Chambers 2b50367d7f Put message persisters behind feature flags. 2020-09-02 11:02:40 -04:00
Jon Chambers 1dcc491fec Move cache-mirroring operations to the calling thread. 2020-09-01 12:34:37 -04:00
Ehren Kret d715f86713 Refactor to constants 2020-09-01 10:55:26 -04:00
Ehren Kret 5221828705 Increase maximum sticker size to 300 kibibytes
In preparation for animated stickers, allow stickers to be up to 300
kibibytes.
2020-09-01 10:55:26 -04:00
Jon Chambers 6aa4acd3db Mirror "clear queue" operations to the clustered cache. 2020-09-01 10:55:07 -04:00
Jon Chambers 15936c29c1 Let Dropwizard manage the lifecycle of the feature flag manager. 2020-09-01 10:50:59 -04:00
Jon Chambers 8b70c69a0d Replace metrics with logging statements. 2020-08-31 15:57:17 -04:00
Jon Chambers dfe80a30dc Make ScourMessageCacheCommand a ConfiguredCommand instead of an EnvironmentCommand. 2020-08-31 15:57:17 -04:00
Jon Chambers ce026e7ad0 Don't send contacts to CDS if they've opted out of discoverability. (SERVER-130) 2020-08-27 15:58:02 -04:00
Jon Chambers 58e3122dab Add a discoverableByPhoneNumber account attribute. (SERVER-129) 2020-08-27 15:58:02 -04:00
Jon Chambers 3b55b2d1b2 Actually make the "scour message cache" available to Dropwizard. Oops. 2020-08-27 15:15:04 -04:00
Jon Chambers 2326e61de5 Clear and re-create gauges to avoid "stuck" feature flag reporting. 2020-08-27 13:18:12 -04:00
Jon Chambers 32b18c9509 Add an endpoint for getting the current state of feature flags. 2020-08-27 13:18:12 -04:00
Jon Chambers acf52ad8a3 Make feature flag manager tests use a real database to avoid over-mocking. 2020-08-27 13:18:12 -04:00
Jon Chambers 08dd493f98 Don't report exceptions as part of traffic metrics. 2020-08-27 13:17:57 -04:00
Jon Chambers 07bbe7dfb2 Return to an async model for push notification latency. 2020-08-27 10:51:44 -04:00
Jon Chambers 0aa1b80e3e Add a command for persisting any detached messages in the old message cache. 2020-08-27 10:51:12 -04:00
Jon Chambers 5ac390281e Add an abstract base class for Redis singleton tests. 2020-08-27 10:51:12 -04:00
Jon Chambers ac465c5a18 Add a Lettuce-based Redis singleton client. 2020-08-27 10:51:12 -04:00
Jon Chambers 1ef3546822
Add support for server-side feature flags 2020-08-26 20:27:33 -04:00
Jon Chambers e74ad2b555 Make RedisClusterMessagesCache a Managed class. 2020-08-25 10:58:01 -04:00
Jon Chambers 71c0056c66 Use lots of specific subscriptions instead of one monster subscription to minimize load. 2020-08-25 10:58:01 -04:00
Jon Chambers 56b27ea785 Record experiment outcomes with timers instead of counters. 2020-08-25 10:57:44 -04:00
Jon Chambers 2d75f59d33 Add support for UUID-only delivery certificates. (SERVER-132) 2020-08-20 17:05:53 -04:00
Jon Chambers a709a3bcc0 Remove a candidate metric provider. 2020-08-20 15:40:56 -04:00
Jon Chambers 34bf5112e0 Drop TimeProvider. 2020-08-20 15:40:24 -04:00
Jon Chambers bfe18d1d28 Re-nerf the clustered message persister. 2020-08-20 15:38:09 -04:00
Jon Chambers 6a76afc20d Add a test to make sure the persister is respecting persist delays. 2020-08-20 15:38:09 -04:00
Jon Chambers 9c469c2f96 Base persister tests on a real Redis cluster. 2020-08-20 15:38:09 -04:00
Jon Chambers 2ab42f3dd6 Refine and expand clustered message cache metrics. 2020-08-19 11:39:05 -04:00
Jon Chambers af34b43a8d Reactivate the message notification experiment. 2020-08-19 11:39:05 -04:00
Jon Chambers 0f71cc7864 Rename metrics associated with cluster circuit breakers for clarity. 2020-08-18 17:59:00 -04:00
Jon Chambers df90de3a5f Change default Lettuce command timeout to 10s. 2020-08-18 16:21:42 -04:00
Jon Chambers 42ea7a9814 Revert Lettuce connection pooling. 2020-08-18 16:21:42 -04:00
Jon Chambers c683cbdb2d Time Redis operations. 2020-08-18 12:20:12 -04:00
Jon Chambers d243b73678 Make Lettuce connection pools configurable. Double the default size. 2020-08-18 12:20:12 -04:00
Jon Chambers dc28d063aa Reactivate the explicit client presence experiment. 2020-08-17 11:34:27 -04:00
Jon Chambers bb6045c1d0 Disarm the client presence manager experiment. 2020-08-15 20:23:05 -04:00
Jon Chambers f1a74b5939 Disarm new message keyspace notifications. 2020-08-15 20:23:05 -04:00
Jon Chambers 6fb9038af1 Move to a synchronous, pooled connection model for Redis clusters. 2020-08-14 17:15:56 -04:00
Jon Chambers 27f721a1f5 Update to resilience4j 1.5.0. 2020-08-14 17:15:56 -04:00
Jon Chambers 5717dc294e Combine the read/write breakers for Redis clusters. 2020-08-14 17:15:56 -04:00
Jon Chambers ae0f8df11b Break out FaultTolerantPubSubConnection as its own thing so different use cases can have their own subscription space. 2020-08-14 17:15:56 -04:00
Jon Chambers 77460ba502 Remove keyspace notification configuration checks because AWS doesn't support `CONFIG GET`. 2020-08-13 15:32:25 -04:00
Jon Chambers f8235da4d8 Fix an issue where the queue for a thread pool was not bounded. 2020-08-13 12:46:11 -04:00
Jon Chambers 8d3316ccd6 Listen for new messages via keyspace notifications. 2020-08-13 12:17:04 -04:00
Jon Chambers 2c29f831e8 Add an explicit client presence system. 2020-08-13 10:56:26 -04:00
Jon Chambers 9457325119 Add pub/sub affordances to FaultTolerantRedisCluster. 2020-08-13 10:56:26 -04:00
Jon Chambers 189f8afcc9 Warm up the test cluster before running tests to avoid transient startup jitters. 2020-08-13 10:56:26 -04:00
Jon Chambers f3a34990ab Update to Lettuce 5.3.3. 2020-08-12 16:57:23 -04:00
Jon Chambers 9699b67510 Record the size of outgoing message lists. 2020-08-12 16:57:10 -04:00
Jon Chambers d60633a46c Add a meter for the number of messages we send via websocket connections. 2020-08-12 16:57:10 -04:00
Jon Chambers 0fcf28e7e7 Use the MessagesManager to actually persist messages. 2020-08-11 15:50:22 -04:00
Jon Chambers 5fad8f74b1 Factor MessagePersister into its own class. 2020-08-11 15:50:22 -04:00
Jon Chambers e35e34d2e0 Move operation-mirroring logic to MessagesManager. 2020-08-11 15:50:22 -04:00
Jon Chambers 31a215d4d6 Use "global." instead of "g." as the prefix for global config options. 2020-08-11 11:55:35 -04:00
Jon Chambers 30948de13d Update a metric provider dependency and remove a workaround for an upstream issue. 2020-08-11 11:02:38 -04:00
Ehren Kret b97158bf7b
Create global remote config controllable in the signal server configuration (#127)
* Add global config controller through file rather than database

* Do not permit attempting to set or delete global config entries
2020-08-10 16:31:15 -05:00
Jon Chambers 6646be8d94 Make CpuUsageGauge a CachedGauge. 2020-08-10 12:56:37 -04:00
Jon Chambers 647a2aea64 Cache a reference to the OS management bean to avoid repeated lookups. 2020-08-10 12:56:37 -04:00
Jon Chambers 58e58ce51c Remove a candidate metric provider. 2020-08-10 11:03:20 -04:00
Ehren Kret 4b7e48d3ec
Override default ingestion URI for SignalFx (#131) 2020-08-07 15:29:42 -05:00
Ehren Kret 0e074d3a5a Copy SignalFxMeterRegistry into a new class to get better logging 2020-08-07 16:01:56 -04:00
Ehren Kret ea00224e7f
Add support for reporting metrics to signalfx (#129) 2020-08-07 11:10:31 -05:00
Jon Chambers 38293efe75 Keep a running count of the number of open websockets. 2020-08-06 16:07:34 -04:00
Jon Chambers 3286c5e174 Disable Redis persistence for tests. 2020-08-06 11:22:51 -04:00
Jon Chambers e0f8a28f38 Close connections before closing the whole cluster client. 2020-08-06 11:22:31 -04:00
Jon Chambers bf1b00b163 Drop a spurious RedisClusterClient. 2020-08-06 11:22:31 -04:00
Ehren Kret 4fa3a136ad
Remove arbitrary SMS and add a NANPA message service (#123)
* Remove arbitrary SMS code

This code has run its course and is no longer needed for now.

* Add elements to sample config that were left out

* Add a messaging service for NANPA

* Fixup sample config capitalization
2020-08-05 13:35:11 -05:00
Jon Chambers 178a6bd66e Log the top-level exception name and message when crawling badness happens. 2020-08-05 11:23:16 -04:00
Ehren Kret 57e1339230
Further restrict user agent pattern matching (#120)
* Further restrict user agent pattern matching

* Add static qualifier to method
2020-08-04 12:58:16 -05:00
Jon Chambers 4144423227 Publish percentiles for Micrometer distributions/timers. 2020-08-04 10:58:59 -04:00
Jon Chambers 4d03514142 Add a command for clearing the messages cache cluster. 2020-08-04 10:58:41 -04:00
Jon Chambers 0bc5566976 Mirror delete-after-persist operations to the clustered message cache. 2020-08-04 10:58:41 -04:00
Jon Chambers 925567add5 Actually "plug in" the reglock counter. 2020-08-03 15:43:33 -04:00
Jon Chambers ad97731d46 Reduce the maximum number of versions in play to 1,000. 2020-08-03 15:42:15 -04:00
Jon Chambers 40684a93a2 Restrict user-agent version matching to a more confined space. 2020-08-03 15:42:15 -04:00
Jon Chambers f3b644ceb8 Update the push latency manager to use UUIDs and a Redis cluster. 2020-08-03 15:36:02 -04:00
Jon Chambers 901ba6e87f Added a push latency manager. 2020-08-03 15:36:02 -04:00
Jon Chambers 76389bd584 Clear would-be-persisted messages from the cache cluster, but don't store them to the database. 2020-07-30 19:14:39 -04:00
Jon Chambers 7bf8650d59 Un-manage FaultTolerantRedisCluster so it shuts down at JVM shutdown instead of Jetty shutdown. 2020-07-30 18:37:38 -04:00
Ehren Kret 7cb24dd96d Add environment tag to datadog metric reporting 2020-07-30 18:04:16 -04:00
Ehren Kret dee040318a Add the host tag to datadog metric reporting 2020-07-30 18:04:16 -04:00
Jon Chambers baf563e46d Temporarily disarm the actual persisting part of the message persister. 2020-07-30 17:12:37 -04:00
Jon Chambers e10246f10b Use Dropwizard timers/histograms for persister metrics. 2020-07-30 14:27:06 -04:00
Jon Chambers a9dfd88671 Start the clustered message persister at application startup. 2020-07-30 12:32:35 -04:00
Jon Chambers beac73b6c8 Add a cluster-capable message persister 2020-07-30 11:39:14 -04:00
Jon Chambers f9f93c77e2 Use UUIDs instead of phone numbers as account identifiers in clustered message cache 2020-07-30 11:39:14 -04:00
Jon Chambers 6fc1b4c6c0 Add a cluster-backed message cache. 2020-07-30 11:39:14 -04:00
Jon Chambers 639898ec07 Expand Experiment to deal with async suppliers and Optionals. 2020-07-30 11:39:14 -04:00
Jon Chambers 3d3790fdbc Add binary execution methods to ClusterLuaScript. 2020-07-30 11:39:14 -04:00
Jon Chambers 69c8968cb0 Add byte-array-based methods to FaultTolerantRedisCluster. 2020-07-30 11:39:14 -04:00
Jon Chambers aa25fc7901 Fix UsernamesManager metric/logger names. 2020-07-29 11:00:29 -04:00
Jon Chambers 4aba493ee2 Fix the key used for database crawler workers. 2020-07-29 10:58:06 -04:00
Jon Chambers b9cfac5934 Introduce additional metric aggregators. 2020-07-28 15:11:51 -04:00
Brian Acton f8e97fcc32 revise 12 hour active user fudge to 8 hours for better continuity of data from a month ago 2020-07-28 11:09:41 -07:00
Jon Chambers 7f8f2641f6 Simplify registration lock counting by avoiding inactive accounts. 2020-07-28 11:48:20 -04:00
Jon Chambers 022dbb606f Count registration lock versions when crawling the account database. 2020-07-28 11:48:20 -04:00
Jon Chambers fea72b190d Record message content size as a dimensioned distribution. 2020-07-28 11:47:56 -04:00
Jon Chambers eea073f882 Decommission the old cache. 2020-07-28 10:29:28 -04:00