@nybble41

nybble41@programming.dev · 3 months ago

In general integer division is implemented using a form of long division, in binary. There is no base-10 arithmetic involved. It’s a relatively expensive operation which usually requires multiple clock cycles to complete, whereas dividing by a power of two (“bit shifting”) is trivial and can be done in hardware simply by routing the signals appropriately, without any logic gates.
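For illustration only, here is a rough Python sketch of the difference. The function names are made up for this example, and real hardware dividers are more sophisticated than this textbook shift-and-subtract loop, but the shape is the same: one compare/subtract step per bit of the dividend, versus a single shift for a power of two.

```python
def divide_by_long_division(dividend: int, divisor: int) -> tuple[int, int]:
    """Binary long division (shift-and-subtract) for non-negative integers.

    Hypothetical helper for illustration; returns (quotient, remainder).
    """
    if dividend < 0 or divisor <= 0:
        raise ValueError("sketch only handles dividend >= 0 and divisor > 0")
    quotient = 0
    remainder = 0
    # One iteration per bit of the dividend, most significant bit first.
    for bit in range(dividend.bit_length() - 1, -1, -1):
        # Bring down the next bit of the dividend.
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        quotient <<= 1
        # If the divisor fits, subtract it and record a 1 in the quotient.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
    return quotient, remainder


def divide_by_power_of_two(value: int, exponent: int) -> int:
    """Dividing by 2**exponent just drops the low bits; no loop is needed."""
    return value >> exponent
```

In software the shift is one cheap instruction; in hardware with a fixed shift amount it is just wiring.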

nybble41@programming.dev · 3 months ago

The metric standard is to measure information in bits.

Bytes are a non-metric unit: a byte is not a power-of-ten multiple of the metric base unit for information, the bit.

If you’re writing “1 million bytes” and not “8 million bits” then you’re not using metric.

If you aren’t using metric then the metric prefix definitions don’t apply.

There is plenty of precedent for the prefixes used in metric to refer to something other than an exact power of 1000 when not combined with a metric base unit. A microcomputer is not one millionth of a computer. One million microscopes do not add up to one scope. Megastructures are not exactly one million times the size of ordinary structures. Etc.

Finally: This isn’t primarily about bit shifting, it’s about computers being based on binary representation and the fact that memory addresses are stored and communicated using whole numbers of bits, which naturally leads to memory sizes (for entire memory devices or smaller structures) which are powers of two. Though the fact that no one is going to do something as idiotic as introducing an expensive and completely unnecessary division by a power of ten for every memory access just so you can have 1000-byte MMU pages rather than 4096 also plays a part.
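To make the last point concrete, a minimal Python sketch, assuming 4096-byte pages (the constant and function names are invented for this example). With a power-of-two page size the page number and offset are just the high and low bits of the address; with a 1000-byte page every access would need a real division.

```python
PAGE_SHIFT = 12                # assumed page size: 4096 = 2**12 bytes
PAGE_SIZE = 1 << PAGE_SHIFT
OFFSET_MASK = PAGE_SIZE - 1    # low 12 bits set


def split_address(address: int) -> tuple[int, int]:
    """Split an address into (page number, offset) by selecting bits."""
    return address >> PAGE_SHIFT, address & OFFSET_MASK


def split_address_decimal(address: int, page_size: int = 1000) -> tuple[int, int]:
    """The same split for a hypothetical 1000-byte page.

    This needs a genuine division and remainder on every lookup.
    """
    return divmod(address, page_size)
```

For example, `split_address(0x12345)` gives `(0x12, 0x345)`: the two halves can be read straight off the hex digits, which is exactly what the hardware does with the address lines.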

nybble41@programming.dev · 5 months ago

If it averages several instances, with enough signal you could decompose a linear combination (e.g. average) of different patterns back out into its constituent parts.

A smarter system won’t just take the mean of the votes from different instances but rather discard outliers as invalid input (flagging repeat offenders to be ignored in the future) and use the median or mode of the remainder. The results should also be quantized to avoid leaking details about sources or internal algorithms; only the larger trends need to be reported.
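A minimal sketch of that idea in Python, under my own assumptions: outliers are detected with the median absolute deviation, and the threshold and quantization step are arbitrary placeholder values. None of this is taken from any real aggregator.

```python
from statistics import median


def robust_score(votes, max_deviations=3.0, step=5.0):
    """Aggregate per-instance votes while resisting manipulation.

    Returns (quantized score, indices of rejected votes). The MAD-based
    outlier rule and the default parameters are illustrative assumptions.
    """
    if not votes:
        raise ValueError("no votes to aggregate")
    center = median(votes)
    # Median absolute deviation: a spread estimate that outliers barely move.
    mad = median(abs(v - center) for v in votes)
    limit = max_deviations * mad
    kept = [v for v in votes if abs(v - center) <= limit]
    # Rejected indices could feed a "repeat offender" list elsewhere.
    rejected = [i for i, v in enumerate(votes) if abs(v - center) > limit]
    # Report only the coarse trend: round the median to the nearest step.
    score = round(median(kept) / step) * step
    return score, rejected
```

With votes of `[70, 72, 71, 69, 73, 500]` the plain mean would be 142.5, while this returns `(70.0, [5])`: the bogus vote is dropped and flagged, and the reported value reveals only which 5-point bucket the rest fall into.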

Of course you could always just keep the collected data private and only provide it to customers willing to pay $$$ for access, which handily limits instance operators’ ability to reverse-engineer the source of the data. And nothing prevents you from using separate instances for public and private data sets.

nybble41@programming.dev · 5 months ago

It would be a nominal charge for storage, bandwidth, and indexing. Book stores carry public-domain titles, for profit, and most have no issue with that. You can always procure the same files somewhere else—they are public domain, after all. Those who pay are doing so for the convenience, not because they’re forced to.

nybble41@programming.dev · 5 months ago

They could stick to public domain & indie titles. They won’t, but they could.

nybble41@programming.dev · 7 months ago

In what sense do you think this isn’t following the email standard? The plus sign is a valid character in the local part, and the standard doesn’t say how it should be interpreted (it could be a significant part of the name; it’s not proper to strip it out) or preclude multiple addresses from delivering to the same mailbox.

Unfortunately the feature is too well-known, and the mapping from the tagged address to the plain address is too transparent. Spammers will just remove the label. You need either a custom domain so you can use a different separator (‘+’ is the default but you can generally choose something else for your own server) or a way to generate random, opaque temporary addresses.

If you want to talk about non-compliant address handling, aside from not accepting valid addresses, the one that always bothers me is sites that capitalize or lowercase the local part of the address. Domain names are not case-sensitive, but the local part is. Changing the case could result in non-delivery or delivery to the wrong mailbox. Most servers are case-insensitive but senders shouldn’t assume that is always true.
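As a sketch of what conservative handling could look like on the sender's side (the function name is hypothetical, and this is deliberately not a full address validator): fold the case of the domain only, and pass the local part through untouched, plus sign and all.

```python
def normalize_address(address: str) -> str:
    """Lowercase only the domain; leave the local part exactly as given.

    Illustrative helper, not a complete validator. Splitting on the LAST
    "@" matters because a quoted local part may itself contain "@".
    """
    local, separator, domain = address.rpartition("@")
    if not separator or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    # Domain names are case-insensitive, so folding their case is safe.
    # The local part may be case-sensitive and may use "+" as part of the
    # name, so it is neither case-folded nor stripped of its tag.
    return f"{local}@{domain.lower()}"
```

So `normalize_address("John.Smith+Lists@Example.COM")` yields `John.Smith+Lists@example.com`, and only the receiving server decides what the `+Lists` part means.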

nybble41@programming.dev · 7 months ago

CVS and E*Trade both refused to accept my fairly standard user@mydomain.info address during initial registration, but had no issue changing to that address once the account was created. It would be nice if their internal teams communicated a bit better.

nybble41@programming.dev · 7 months ago

Look up the legal principle of estoppel. In general you can’t turn around and sue someone for doing something after informing them (in writing no less) that you’re okay with it, even if you would otherwise have had a valid basis to sue.

nybble41@programming.dev · 8 months ago

They ruled that people acting together have all the same rights that they would have acting individually, and that preventing someone from spending money on producing and promoting their speech effectively prevents them from being heard. Which are both perfectly true, common-sense statements.

nybble41@programming.dev · 8 months ago

If you can read emails sent to a given address, and send replies from that address, it basically is your email address for all practical purposes no matter who was meant to be using the account. This is not necessarily a good thing and better end-to-end security would be nice but it is what it is. Odds are the app itself would let anyone change the password and log in provided they can read the emails, unless it’s using some form of 2FA.

nybble41@programming.dev · 8 months ago

So you’re not remapping the source ports to be unique? There’s no mechanism to avoid collisions when multiple clients use the same source port? Full Cone NAT implies that you have to remember the mapping (potentially indefinitely—if you ever reassign a given external IP:port combination to a different internal IP or port after it’s been used you’re not implementing Full Cone NAT), but not that the internal and external ports need to be identical. It would generally only be used when you have a large enough pool of external IP addresses available to assign a unique external IP:port for every internal IP:port. Which usually implies a unique external IP for each internal IP, as you can’t restrict the number of unique ports used by each client. This is why most routers only implement Symmetric NAT.

(If you do have sufficient external IPs the Linux kernel can do Full Cone NAT by translating only the IP addresses and not the ports, via SNAT/DNAT prefix mapping. The part it lacks, for very practical reasons, is support for attempting to create permanent unique mappings from a larger number of unconstrained internal IP:port combinations to a smaller number of external ones.)
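A toy Python model of the constraint described above; the class and method names are invented for this sketch and have nothing to do with how the kernel implements NAT. Each internal IP:port gets one external IP:port that is never reassigned, the source port is kept when it is free and remapped when it collides, and the pool simply runs out once there are more internal endpoints than external ones.

```python
import itertools


class FullConeNat:
    """Toy model of Full Cone NAT with permanent one-to-one mappings."""

    def __init__(self, external_ips, ports=range(1024, 65536)):
        self._external_ips = list(external_ips)
        self._ports = ports
        self._by_internal = {}   # (internal ip, port) -> (external ip, port)
        self._by_external = {}   # reverse lookup for inbound packets

    def map_outbound(self, internal):
        """Return the external endpoint for an internal (ip, port) pair."""
        if internal in self._by_internal:
            return self._by_internal[internal]   # mappings never change
        _, port = internal
        # Prefer keeping the source port; on a collision use any free port.
        preferred = ((ip, port) for ip in self._external_ips)
        fallback = ((ip, p) for ip in self._external_ips for p in self._ports)
        for external in itertools.chain(preferred, fallback):
            if external not in self._by_external:
                self._by_internal[internal] = external
                self._by_external[external] = internal
                return external
        raise RuntimeError("external IP:port pool exhausted")

    def map_inbound(self, external):
        """Any remote host may use the mapping; None if it does not exist."""
        return self._by_external.get(external)
```

With a single external IP and two usable ports, two clients that both use source port 5000 get distinct external ports, and a third client cannot be mapped at all without breaking an existing mapping, which is the practical problem with many-to-few Full Cone NAT.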

nybble41@programming.dev · 8 months ago

What “increased risks as far as csam”? You’re not hosting any yourself, encrypted or otherwise. You have no access to any data being routed through your node, as it’s encrypted end-to-end and your node is not one of the endpoints. If someone did use I2P or Tor to access CSAM and your node was randomly selected as one of the intermediate onion routers there is no reason for you to have any greater liability for it than any of the ISPs who are also carrying the same traffic without being able to inspect the contents. (Which would be equally true for CSAM shared over HTTPS—I2P & Tor grant anonymity but any standard password-protected web server with TLS would obscure the content itself from prying eyes.)

nybble41@programming.dev · 8 months ago

I’m fairly certain that last one is UB in C. The result of an assignment operator is not an lvalue, and even if it were it’s UB (at least in C99) to modify the stored value of an object more than once between two adjacent sequence points. It might work in C++, though.

nybble41@programming.dev · edit-2 8 months ago

No, that’s not how I2P works.

First, let’s start with the basics. An exit node is a node which interfaces between the encrypted network (I2P or Tor) and the regular Internet. A user attempting to access a regular Internet site over I2P or Tor would route their traffic through the encrypted network to an exit node, which then sends the request over the Internet without the I2P/Tor encryption. Responses follow the reverse path back to the user. Nodes which only establish encrypted connections to other I2P or Tor nodes, including ones used for internal (onion) routing, are not exit nodes.

Both I2P and Tor support the creation of services hosted directly through the encrypted network. In Tor these are referred to as onion services and are accessed through *.onion hostnames. In I2P these internal services (*.i2p or *.b32.i2p) are the only kind of service the protocol directly supports, though you can configure a specific I2P service linked to an HTTP/HTTPS proxy to handle non-I2P URLs in the client configuration. There are only a few such proxy services as this is not how I2P is primarily intended to be used.

Tor, by contrast, has built-in support for exit nodes. Routing traffic anonymously from Tor users to the Internet is the original model for the Tor network; onion services were added later. There is no need to choose an exit node in Tor—the system maintains a list and picks one automatically. Becoming a Tor exit node is a simple matter of enabling an option in the settings, whereas in I2P you would need to manually configure a proxy server, inform others about it, and have them adjust their proxy configuration to use it.

If you set up an I2P node and do not go out of your way to expose an HTTP/HTTPS proxy as an I2P service then no traffic from the I2P network can be routed to non-I2P destinations via your node. This is equivalent to running a Tor internal, non-exit node, possibly hosting one or more onion services.

nybble41@programming.dev · 8 months ago

It is not true that every node is an exit node in I2P. The I2P protocol does not officially have exit nodes—all I2P communication terminates at some node within the I2P network, encrypted end-to-end. It is possible to run a local proxy server and make it accessible to other users as an I2P service, creating an “exit node” of sorts, but this is something that must be set up deliberately; it’s not the default or recommended configuration. Users would need to select a specific I2P proxy service (exit node) to forward non-I2P traffic through and configure their browser (or other network-based programs) to use it.

nybble41@programming.dev · 9 months ago

Examples of local commands I might run in tmux could include anything long-running which is started from the command line. A virtual machine (qemu), perhaps, or a video encode (ffmpeg). Then if I need to log out or restart my GUI session for any reason—or something goes wrong with the session manager—it won’t take the long-running process with it. While the same could be done with nohup or systemd-run, using tmux allows me to interact with the process after it’s started.

I also have systems which are accessed both locally and remotely, so sometimes (not often) I’ll start a program on a local terminal through tmux so I can later interact with it through SSH without resorting to x11vnc.

nybble41@programming.dev · 9 months ago

Not the GP but I also use tmux (or screen in a pinch) for almost any SSH session, if only as insurance against dropped connections. I occasionally use it for local terminals if there is a chance I might want a command to outlive the current graphical session or migrate to SSH later.

Occasionally it’s nice to be able to control the session from the command line, e.g. splitting a window from a script. I’ve also noticed that wrapping a program in tmux can avoid slowdowns when a command generates a lot of output, depending on the terminal emulator. Some emulators will try to render every update even if it means blocking the output from the program for the GUI to catch up, rather than just updating the state of the terminal in memory and rendering the latest version.

nybble41@programming.dev · 10 months ago

MongoDB is under the Server Side Public License (SSPL) which is not an Open Source license.