[math-fun] GMail is embarrassing me
I said [I have no idea how or why GMail sent two copies of this.] Due to marginal connectivity, more than half the time I click Send, GMail says "An error occurred and your message was not sent". This sometimes gives me a chance to add a couple of afterthoughts before retrying. But GMail turns out to be lying--the message *was* sent, at least to some recipients, so it looks like I'm bothering you all with resends of trivial afterthoughts. --rwg
There's no theoretical way to prevent this, right? That is, any system in which connectivity might be lost must in some case either claim a message was sent when it wasn't, or that it wasn't when it was?

--Michael

On Sep 1, 2012 3:53 PM, "Bill Gosper" <billgosper@gmail.com> wrote:
I said [I have no idea how or why GMail sent two copies of this.] Due to marginal connectivity, more than half the time I click Send, GMail says "An error occurred and your message was not sent". [...]
On 9/1/12, Michael Kleber <michael.kleber@gmail.com> wrote:
There's no theoretical way to prevent this, right? That is, any system in which connectivity might be lost must in some case either claim a message was sent when it wasn't, or that it wasn't when it was?
Theory might deny perfection, but Gmail is clearly far, far below what's possible. Given a connection that is instantaneously unreliable, but intermittently sustained over the long term, one can get the odds of successful non-duplicated message delivery to within 2^-K of certainty.

For example, TCP ensures that each message (or "packet") is delivered exactly once by requiring that the sender and receiver agree on a sequence number. The sender can't begin trying to send packet N+1 until it receives confirmation of the receipt of packet N.

http://en.wikipedia.org/wiki/Transmission_Control_Protocol#Reliable_transmis...

The reliability of the algorithm depends on the number of bits and the goodness of the hash function. TCP uses only a 16-bit checksum, but of course that can be improved. With CRC-64, everyone in the world could send a million messages before anyone would expect to see an error.

(For efficiency, modern protocols have a "window", allowing the sender and receiver to be retrying multiple messages at any given time, so long as the leading edge doesn't get too far ahead of the trailing edge.)

-- Robert Munafo -- mrob.com
Follow me at: gplus.to/mrob - fb.com/mrob27 - twitter.com/mrob_27 - mrob27.wordpress.com - youtube.com/user/mrob143 - rilybot.blogspot.com
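To make the sequence-number idea concrete, here is a toy stop-and-wait sketch in Python (mine, not anything from a real TCP stack): the receiver delivers each sequence number exactly once and re-acknowledges duplicates, so the sender can retry blindly without causing duplicate delivery. The loss rate and the CRC arithmetic at the end are illustrative assumptions, not measurements.

    import random
    random.seed(1)
    LOSS = 0.4  # assumed per-direction drop probability

    class Receiver:
        def __init__(self):
            self.expected = 0          # next sequence number to deliver
            self.delivered = []
        def accept(self, seq, msg):
            if seq == self.expected:   # new packet: deliver exactly once
                self.delivered.append(msg)
                self.expected += 1
            # old/duplicate packet: don't re-deliver, but do re-ack,
            # since the sender evidently missed our earlier ack
            return seq

    def transmit(rx, seq, msg):
        """One round trip over the lossy link; returns the ack or None."""
        if random.random() < LOSS:     # data packet lost
            return None
        ack = rx.accept(seq, msg)
        if random.random() < LOSS:     # ack lost; sender must retry
            return None
        return ack

    rx = Receiver()
    for seq, msg in enumerate(["one", "two", "three"]):
        while transmit(rx, seq, msg) != seq:
            pass                       # keep retrying the same seq
    print(rx.delivered)                # ['one', 'two', 'three'] -- no dups

    # On the CRC-64 claim: an undetected corruption slips past a 64-bit
    # CRC with probability about 2**-64. Seven billion senders times a
    # million messages each is ~7e15 trials:
    print(7e9 * 1e6 * 2**-64)          # ~0.0004 expected undetected errors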
On Sun, Sep 2, 2012 at 3:34 AM, Robert Munafo <mrob27@gmail.com> wrote:
On 9/1/12, Michael Kleber <michael.kleber@gmail.com> wrote:
There's no theoretical way to prevent this, right? That is, any system in which connectivity might be lost must in some case either claim a message was sent when it wasn't, or that it wasn't when it was?
Theory might deny perfection, but Gmail is clearly far, far below what's possible.
Given a connection that is instantaneously unreliable, but intermittently sustained over the long term, one can get the odds of successful non-duplicated message delivery to within 2^-K of certainty.
What is K here?

And I think that the algorithms you are describing below are the solution to a different problem than the one that Gmail is trying to solve. Let's suppose Gmail sends the message and does not receive an acknowledgement. (If you want to look at things at a low enough level that the message is sent piecemeal, just look at the last part to be sent, or assume the message is small enough to fit in a single packet; the piecemeal view is a distraction here.) It could be that the message was sent successfully and the link went down, causing the ack to fail, or it could be that the message was never sent. What should Gmail do now? The communications link might be down for an hour, so "don't tell the user 'message sent' or 'message not sent' for an hour" is not an acceptable solution. The user wants to know "was my message sent?" now, not after the link comes up in an hour.

You might try to improve the protocol by saying "well, the recipient should ack the message, but not actually deliver it until it receives an acknowledgement that the ack was received", but that just pushes the problem back a level. Now consider a recipient that sends an ack, but the line goes down before it gets the ack-of-ack. What is it supposed to do? If it delivers the message, then Gmail might never have gotten the ack, and will tell its user the message was not delivered, resulting in duplication. If it does not deliver, then Gmail may have gotten the ack after all, and will report delivery even though the recipient, still waiting for the ack-of-ack, never delivers the message. Somewhere in the protocol is a final message: if it is sent and received, the mail is delivered; if it is not sent, the mail is not delivered; and if it is sent but the sender doesn't know whether it was received, Gmail can't give a guaranteed-to-be-correct "mail was delivered" or "mail was not delivered" response.

You seem below to be giving algorithms for increasing the chance that the message will get delivered eventually, but that's not the issue here. The link is down now, and Gmail wants to tell the user either "the message was sent" or "error; the message was not sent". If not responding at all until the link is back up is unacceptable, and a third response of "maybe the message was sent, and maybe it wasn't; I'll let you know in an hour" is also unacceptable, then Gmail must sometimes give the user an inaccurate message, and I don't see how anything you describe below decreases that probability.

There's also something very odd about saying that Gmail should be using algorithms like TCP's to decrease the chance of failure here; normally, Gmail *is* using TCP to communicate between your browser and gmail.com. That doesn't change the fact that if the link goes down, either the message was delivered or it wasn't, and your browser cannot know with certainty whether it reached gmail.com.

Andy
For example, TCP ensures that each message (or "packet") is delivered exactly once by requiring that the sender and receiver agree on a sequence number. [...]
-- Andy.Latto@pobox.com
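Andy's scenario is the classic Two Generals problem. Here is a tiny Python sketch of why no finite chain of acknowledgements escapes it (the chain length and loss pattern are illustrative assumptions): however many levels of acks a protocol uses, drop the final transmission and the side that sent it cannot distinguish the two outcomes.

    def run(k, arrivals):
        """k alternating transmissions: msg, ack, ack-of-ack, ...
        Transmission i goes sender->receiver when i is even, and back
        when i is odd. Only the first `arrivals` transmissions get
        through. Returns how many transmissions each side heard."""
        heard = {"sender": 0, "receiver": 0}
        for i in range(arrivals):
            heard["receiver" if i % 2 == 0 else "sender"] += 1
        return heard

    K = 5                          # message plus four levels of acks
    good = run(K, arrivals=K)      # everything arrives
    bad = run(K, arrivals=K - 1)   # only the very last transmission is lost

    # The sender of that final transmission has heard exactly the same
    # acks in both runs, so its local state is identical -- yet in one
    # run the last message arrived and in the other it didn't. No value
    # of K changes this, which is why Gmail must sometimes guess.
    print(good, bad)               # sender's count: 2 vs 2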
Thanks, Andy, for expressing what I meant better than I did.

If you're willing to relax the time constraints (on both message sending and notification of success), then I suppose the natural thing to do is to assume your browser will indeed have a net connection at some indefinite time in the future, and trust that the mail will be delivered eventually.

I suppose that's what Gmail's "Offline Mode" does (Gear icon -> Settings -> Offline): if you have no connection, it will queue up any activity for when the connection comes back. This would probably solve RWG's double-send problem (though not the problem of wanting to add a few more lines to something after sending it).

--Michael

On Sun, Sep 2, 2012 at 9:36 AM, Andy Latto <andy.latto@pobox.com> wrote:
What is K here? And I think that the algorithms you are describing below are the solution to a different problem than the one that Gmail is trying to solve. [...]
-- Forewarned is worth an octopus in the bush.
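A minimal Python sketch of the queue-and-retry idea Michael describes; the class names and the client-generated message ID are assumptions for illustration, not Gmail's actual mechanism. Because a retry reuses the same ID, the server can make duplicates no-ops:

    import uuid

    class Server:
        def __init__(self):
            self.seen = set()
            self.inbox = []
        def accept(self, msg_id, body):
            if msg_id not in self.seen:    # a retried ID is a no-op,
                self.seen.add(msg_id)      # so resends can't duplicate
                self.inbox.append(body)

    class Outbox:
        def __init__(self, server):
            self.server = server
            self.queue = []                # messages not yet accepted
        def send(self, body):
            self.queue.append((str(uuid.uuid4()), body))  # ID chosen once
            self.flush()
        def flush(self):                   # call whenever we come online
            still_waiting = []
            for msg_id, body in self.queue:
                try:
                    self.server.accept(msg_id, body)
                except ConnectionError:    # still offline; keep it queued
                    still_waiting.append((msg_id, body))
            self.queue = still_waiting

Note that the honest status shown to the user is "queued", never a premature "sent", which sidesteps the impossibility above rather than beating it.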
Yes, you're right, I missed the point.

I think that email clients should tell the (human) sender when the email has been opened by the (human) recipient. The rest of the steps (like sending the email from the browser to one of Gmail's servers) are details that are not as important.

I would go in the direction of making the user wait a day or two to find out if his message has been read.

On 9/2/12, Andy Latto <andy.latto@pobox.com> wrote:
And I think that the algorithms you are describing below are the solution to a different problem than the one that Gmail is trying to solve. [...]
-- Robert Munafo -- mrob.com Follow me at: gplus.to/mrob - fb.com/mrob27 - twitter.com/mrob_27 - mrob27.wordpress.com - youtube.com/user/mrob143 - rilybot.blogspot.com
="Robert Munafo" <mrob27@gmail.com> I think that email clients should tell the (human) sender when the email has been opened by the (human) recipient.
Alas, this isn't as simple as we'd hope. Such capability has been available in various guises for a long time; google [return receipt email] and see http://en.wikipedia.org/wiki/Return_receipt There's even an ad: "Receive court admissible certified delivery receipt."

I've had several correspondents who tried to use receipts regularly, but the practice seems to have generally fallen into disuse. There are many issues. For example, if you don't get a receipt back for a while, should you resend, and risk spamming the recipient with multiple copies? Conversely, receipts are a way for spammers to confirm your address is "live" and thus keep you on their junk mail list--which is then of higher quality to re-sell to other spammers, etc. (Automatic download and rendering of embedded images in HTML messages has the same problem--the spammer gets the image download request back and thereby knows there's someone who opens messages at that email address.) I've also delayed allowing receipts to go back, to avoid having the sender mistake "opened to glance at the summary" for "opened and actually read".

Systems managed inside a single organization, which can track message status as a bit of shared global state rather than by exchanging yet more messages, are of course more tractable; the wilds of the web are a different story.
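For concreteness, the embedded-image "web bug" mechanism is just this (a toy Python server with a made-up hostname, not any real product): the sender embeds a per-message image URL, and the recipient's mail client fetching it is the receipt. That fetch is also exactly the liveness signal spammers want, which is why clients now tend to block remote images by default.

    from http.server import BaseHTTPRequestHandler, HTTPServer

    class ReceiptTracker(BaseHTTPRequestHandler):
        def do_GET(self):
            # The path arrives as e.g. /open?id=msg-1234, an ID the
            # sender chose; a real tracker would log it with a timestamp.
            print("opened:", self.path)
            self.send_response(204)    # empty response; the request
            self.end_headers()         # itself was the "receipt"

    # The sender embeds this in the HTML body (per-message id):
    #   <img src="http://tracker.example.com/open?id=msg-1234">

    if __name__ == "__main__":
        HTTPServer(("", 8000), ReceiptTracker).serve_forever()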
I suspect, but do not know for sure, that if rwg did not make additional edits, he would not get "duplicates". It is probably the additional postscripts that make it a new, different message, so Gmail won't invoke a duplicate suppression algorithm. Both TCP and SMTP have duplicate message suppression as part of the protocol.

On Sep 2, 2012 8:49 AM, "Robert Munafo" <mrob27@gmail.com> wrote:
Yes, you're right, I missed the point. I think that email clients should tell the (human) sender when the email has been opened by the (human) recipient. [...]
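Tom's point in miniature (a hypothetical sketch; whatever Gmail actually keys its duplicate suppression on, an edited body changes it): deduplication keyed on message content catches an exact resend but not one with a postscript added.

    import hashlib

    def dedup_key(body):
        # stand-in for whatever fingerprint the mail system keeps
        return hashlib.sha256(body.encode()).hexdigest()

    original = "Neat identity! --rwg"
    resend = original + "\nPS: a couple of afterthoughts..."

    print(dedup_key(original) == dedup_key(original))  # True: exact resend is caught
    print(dedup_key(original) == dedup_key(resend))    # False: postscript makes it "new"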
On Sep 2, 2012 8:49 AM, "Robert Munafo" <mrob27@gmail.com> wrote:
I think that email clients should tell the (human) sender when the email has been opened by the (human) recipient.
I would hope not. This would provide live-address verification for spam. Also, as with physical mail, your sending me a message should allow me the freedom to read it, or not, without your knowledge. Finally, opening a mail does not mean the mail was read. Microsoft email has the return-receipt feature, and HTML mail uses hidden image "bugs" to implement this sort of thing; both are odious invasions of the recipient's privacy.
participants (6)
- Andy Latto
- Bill Gosper
- Marc LeBrun
- Michael Kleber
- Robert Munafo
- Tom Rokicki