A lot of the time, the R1’s responses come as numbered lists or key points. I get that it’s probably meant to be organized, but honestly, it feels pretty unnatural. When I’m looking for information, I prefer it to come across more like how a person would actually talk.
In my opinion, it would be much nicer if the Rabbit could combine the information in a more conversational way, like a human would when explaining something. Instead of breaking things down into points 1, 2, 3, it could flow more naturally, like a friend telling you what they found out. Or it could respond with a single refining question. That would make the interaction feel a lot smoother and more engaging.
OpenAI’s voice mode started out with numbering as well, but it is now far more conversational and pleasant to talk with.
Example:
= ‘To start off…’
= ‘Another…’
= ‘Lastly…’
I think a voice OS should avoid the patterns we use for written (structured) information. Or it could show a structured format on screen while the voice goes over it in a more conversational way.
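To make that split a bit more concrete, here is a rough Python sketch. It is only an illustration under my own assumptions: the function names and the sample items are made up, and nothing here reflects how the R1 actually builds its responses. The same list of results is rendered twice, once as a numbered list for the screen and once as a spoken passage that uses the ‘To start off…’, ‘Another…’, ‘Lastly…’ connectives instead of numbers.

```python
# Hypothetical sketch: these names are illustrative, not part of any Rabbit software.

def render_for_screen(items):
    """Numbered list, meant to be read on the display."""
    return "\n".join(f"{i}. {item}" for i, item in enumerate(items, start=1))


def render_for_voice(items):
    """Same items as one spoken passage, with connectives instead of numbers."""
    if not items:
        return "I couldn't find anything."
    if len(items) == 1:
        return f"I found one thing: {items[0]}."
    parts = []
    last_index = len(items) - 1
    for i, item in enumerate(items):
        if i == 0:
            parts.append(f"To start off, there's {item}.")
        elif i == last_index:
            parts.append(f"Lastly, there's {item}.")
        else:
            parts.append(f"Another is {item}.")
    return " ".join(parts)


places = ["a ramen bar", "a taco truck", "a pizza place"]
print(render_for_screen(places))  # structured, for the screen
print(render_for_voice(places))   # conversational, for the speaker
```

The point of keeping two renderers is that the structured data is never lost: people who prefer lists can still read them, while the voice stays free of numbering.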
I have some other ideas. The Rabbit R1 could be connected to a computer: through software, with both devices on the same Wi-Fi network, and with the R1 kept on charge. After that, the R1 could accompany you in everything you do, without you needing to call it or press the PTT key to ask questions. For example, it could give you suggestions while you are playing games. After you watch a movie or video, the R1 could summarize it. When you are using office software and not working fast enough, the R1 could suggest keyboard shortcuts, or check for errors and point them out.
Sorry, I didn’t understand what you said just now. My English is not very good. I have to reply with translation software, so it’s very bad. Now I understand what you mean. However, it will take some time and I will answer your question.
This is what I think natural language processing is.
A: “I’ve been so tired from work recently. I want to find a place to relax. Any recommendations?”
R1: “Hey, then I must give you a good introduction to an awesome place. There is a place called [name of scenic spot]. If you go there, you will definitely be amazed by its beauty. When you look up, there are blue skies and white clouds. Surrounded by green mountains and clear waters, the scene is just like a painting. Let’s talk about the characteristics here first. There are ancient buildings there, full of historical charm. Every stone seems to be telling stories of the past. Walking on those ancient streets, you will feel like you have traveled back in time all of a sudden. In addition, the food here is also superb. The local specialty [name of food] tastes extremely good. Take a bite, and the rich fragrance spreads in your mouth, leaving you with an endless aftertaste. Finally, the people here are extremely warm and hospitable. They will warmly welcome every tourist and make you feel the warmth of home. So, if you want to find a place to relax, you might as well come here and take a look. I believe you will definitely fall in love with this place!”
A: “Sounds really good. Is it convenient to live there?”
R1: “Don’t worry. Daily life here is very convenient. There are various small stores that can meet your daily needs, and the prices are relatively reasonable. The transportation around is also quite convenient. Whether you want to go to other places for a stroll or just want to walk around near the scenic spot, it’s all very convenient. Here, you can slow down, enjoy a leisurely daily life, and relax yourself well.”
A: “Recently, I want to find a fun place to travel, but I’m so conflicted and don’t know where to go.”
R1: “Hey, then listen up! I’ll recommend an extremely cool tourist attraction to you. This place is called [specific attraction name]. As soon as you step into there, you will immediately be shocked by its beauty. Under the blue sky and white clouds, surrounded by green mountains and clear waters, it’s just like a painting. Let’s talk about the characteristics here. There are ancient buildings there, and every place is full of historical charm. Those stones seem to be telling stories of the past. When you walk on those ancient streets, that feeling is just like traveling back in time. In addition, the food here is also superb. The local specialty [food name] tastes simply wonderful. As soon as you take a bite, that rich fragrance spreads in your mouth, leaving you with an endless aftertaste. Finally, it must be mentioned that the people here are very warm and hospitable. They will warmly welcome every tourist and make you feel the warmth of home. So, if you are looking for a fun place, you might as well come here and take a look. I believe you will definitely fall in love with this place!”
A: “Sounds really good. Is this place far away?”
R1: “It depends on where you start from. But no matter how far or near it is, I think this place is worth a visit. When you truly stand on that beautiful land, feel the charm of ancient buildings, taste delicious food, and be surrounded by warm people, you will think that everything is worth it.”
As a technical person myself, I prefer the structured, direct approach because it avoids the noise of conversational fillers. Structured lists and key points work well for me because they deliver the information quickly and efficiently.
If there’s a way to integrate both structured responses and a conversational touch without losing the essence of concise information, that could be ideal. For now, though, sticking with a straightforward format is what suits me best.
I understand what you are saying.
I’m also interested in the possibility of combining the two. For example, not reading the search results verbatim, but finding a more natural way to combine everything that is found while still showing the results in a more structured way on screen.
What do you think about ChatGPT’s voice mode?
I made this topic because I was showing the R1 to some friends and it was quite uncomfortable to hear it respond with some simple recommendations as a numbered list. (It’s quite robotic).
The only thing holding me back from picking up the R1 more often is that it doesn’t really work with the affordances I expect in a conversation. It is way easier to look something up on another device and take follow-up actions there.
I would expect a voice assistant to help with thinking things through and to try to understand the context, instead of mostly working like a search engine.
Less like a Pokédex, more like Samantha from the movie Her.
I see a future where devices will continue to rely on current interfaces to present direct, structured information, while voice interactions will excel in offering more fluid, conversational exchanges. Rather than outputting rigid, structured data, voice interfaces should focus on delivering insights in a natural, context-aware manner, making complex data feel more accessible and intuitive for users. This balance will create an experience that feels both efficient and human-centric.
I guess Rabbit tries both by providing a conclusion at the end, but I think it would work better if what is said were more conversational, with the more structured data shown as a list you can read yourself.
For example:
User: ‘I’d love to eat some [dish]. I could make it myself.’
R1: ‘I’ve found some recipes for [dish]… They all take about 20 to 35 minutes to prepare. (Recipes are on screen in a numbered list you can fold open.) Let me know if you want me to walk you through one of them, or if I can help in any other way.’
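As a rough sketch of how a reply like that could be put together, assuming the search returns a title and a preparation time per recipe (the `Recipe` class, `build_response`, and the sample recipes are all hypothetical, not anything from Rabbit): the spoken part only summarizes the time range, and the numbered list goes to the screen.

```python
from dataclasses import dataclass


@dataclass
class Recipe:
    # Hypothetical shape of one search result.
    title: str
    minutes: int


def build_response(dish, recipes):
    """Return (spoken, on_screen): a short spoken summary and a numbered list for the display."""
    on_screen = [f"{i}. {r.title} ({r.minutes} min)" for i, r in enumerate(recipes, start=1)]
    if not recipes:
        return f"I couldn't find any recipes for {dish}.", on_screen

    fastest = min(r.minutes for r in recipes)
    slowest = max(r.minutes for r in recipes)
    if fastest == slowest:
        timing = f"about {fastest} minutes"
    else:
        timing = f"about {fastest} to {slowest} minutes"

    spoken = (
        f"I've found some recipes for {dish}. They all take {timing} to prepare. "
        "Let me know if you want me to walk you through one of them."
    )
    return spoken, on_screen


spoken, on_screen = build_response(
    "pancakes", [Recipe("Classic pancakes", 20), Recipe("Fluffy pancakes", 35)]
)
print(spoken)
print("\n".join(on_screen))
```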
I’ve noticed that Rabbit doesn’t always read what is on screen verbatim! I’m all for that! But multiple items still get numbers, which makes the conversation feel unnatural.
I’m not sure where the other topic is, but this makes me think of a topic about having different personas. Sometimes people want concise and list-oriented answers, other times relaxed, informal, and conversational ones. Why can’t you just ask the R1 to take on a role until told otherwise?
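A ‘role until told otherwise’ could be as simple as a setting that sticks between requests. Here is a minimal Python sketch of that idea; the `Assistant` class and the two style names are my own invention for illustration, not an existing R1 feature or API.

```python
class Assistant:
    """Hypothetical assistant that keeps one response style until it is changed."""

    STYLES = {"concise", "conversational"}

    def __init__(self):
        self.style = "conversational"  # assumed default

    def set_style(self, style):
        """Switch persona; it stays active for every later answer."""
        if style not in self.STYLES:
            raise ValueError(f"unknown style: {style}")
        self.style = style

    def answer(self, items):
        """Render the same items according to the current persona."""
        if not items:
            return "I couldn't find anything."
        if self.style == "concise":
            return "\n".join(f"{i}. {item}" for i, item in enumerate(items, start=1))
        if len(items) == 1:
            return f"You could try {items[0]}."
        return "You could try " + ", ".join(items[:-1]) + " or " + items[-1] + "."


r1 = Assistant()
print(r1.answer(["tea", "coffee", "cocoa"]))  # conversational by default
r1.set_style("concise")
print(r1.answer(["tea", "coffee", "cocoa"]))  # numbered list from now on
```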