Unifying the known and unknown microbial coding sequence space